
Character constants in formatted read


nshaffer

Apr 19, 2019, 5:02:15 PM
I recently learned that character constants are disallowed in the format of a read statement.
------------------------------------------------
$ cat birthday.f90
program birthday
implicit none

integer :: d, m, y

write (*, '(a)', advance='no') "Please enter your date of birth in dd/mm/yyyy format: "
read (*, '(i2,"/",i2,"/",i4)') d, m, y
write (*,'(3(a,i0,:,/))') "Day = ", d, "Month = ", m, "Year = ", y
end program birthday

$ gfortran birthday.f90
$ ./a.out
Please enter your date of birth in dd/mm/yyyy format: 01/01/1965
At line 7 of file birthday.f90 (unit = 5, file = 'stdin')
Fortran runtime error: Constant string in input format
(i2,"/",i2,"/",i4)
^
------------------------------------------------
It's not obvious to me why this should be disallowed. It seems like it could be very powerful. I have in mind, for instance, the potential to use the format '(a,'=',d20)' to parse a string like 'foo = 42 ' into a character variable 'foo ' and an integer 42 with a single read statement.

Can anyone educate me on why character constants are not (or should not be?) permitted in the format of a read statement?

(I am aware that some compilers permit it but just treat the character in the format string as a blank.)

steve kargl

Apr 19, 2019, 5:10:24 PM
nshaffer wrote:

> It's not obvious to me why this should be disallowed. It seems like it could
> be very powerful. I have in mind, for instance, the potential to use the format
> '(a,'=',d20)' to parse a string like 'foo = 42 ' into a character variable 'foo '
> and an integer 42 with a single read statement.

What you have described is called NAMELIST in the Fortran standard.

--
steve


FortranFan

Apr 19, 2019, 6:16:06 PM
On Friday, April 19, 2019 at 5:02:15 PM UTC-4, nshaffer wrote:

> ..
>
> Can anyone educate me on why character constants are not (or should not be?) permitted in the format of a read statement?
> ..

Because it is superfluous?!

You may know the Fortran standard permits X editing and TR editing e.g., in section 13 of Fortran 2018:

13.8.1.3 X editing
1 The nX edit descriptor indicates that the transmission of the next
character to or from a record is to occur at the character position n
characters forward from the current position.

NOTE 1
An nX edit descriptor has the same effect as a TRn edit descriptor.

So you can do the following:
read (*, '(i2,1x,i2,1x,i4)') d, m, y

which has the same effect as what you appear to be seeking in your code in the original post.

You may know the Fortran standard allows character constants in output editing:

--- begin example ---
integer :: d, m, y

write (*, '(a)', advance='no') "Please enter your date of birth in dd/mm/yyyy format: "
read (*, '(i2,1x,i2,1x,i4)') d, m, y
write (*,'("Day = ",i0,", Month = ",i0,", Year = ",i0)') d, m, y

end
--- end example ---

Upon execution,
Please enter your date of birth in dd/mm/yyyy format: 04/19/2019
Day = 4, Month = 19, Year = 2019

As explained upthread, for your needs " to parse a string like 'foo = 42 ' into a character variable 'foo ' and an integer 42 ", look into NAMELIST, a maligned but useful facility in Fortran:

--- begin example ---
integer :: d, m, y
namelist / bday / d, m, y
character(len=:), allocatable :: s
write (*, '(a)', advance='no') "Please enter your date of birth in dd/mm/yyyy format: "
read (*, '(i2,1x,i2,1x,i4)') d, m, y
write (*,nml=bday)
s = "&bday d=15, m=4, y=1957/"
read( s, nml=bday )
write(*,'("Fortran''s Birthday: ")')
write (*,'("Day = ",i0,", Month = ",i0,", Year = ",i0)') d, m, y

end
--- end example ---

Upon execution,

Please enter your date of birth in dd/mm/yyyy format: 04/19/2019
&BDAY
D=4 ,
M=19 ,
Y=2019 ,
/
Fortran's Birthday:
Day = 15, Month = 4, Year = 1957


ga...@u.washington.edu

Apr 20, 2019, 1:22:00 AM
On Friday, April 19, 2019 at 2:02:15 PM UTC-7, nshaffer wrote:
> I recently learned that character constants are disallowed
> in the format of a read statement.

I haven't looked lately, but in Fortran 66 H format descriptors
were allowed with READ statements, but I suspect they don't do what
you expect.

The read-in characters at that part of the record replace the
characters in the FORMAT. If that same FORMAT is used with
a WRITE statement, the read-in characters get written out.

This was mostly not so useful with most systems, as you would
usually want a carriage control character at the beginning of the
output line, while not expecting it on the input line.

I never knew anyone to use this feature. I don't know when
it got removed.

That is with actual FORMAT statements, I am not sure about variable
formats (Fortran 66 style) or character constant formats
(later versions).

ga...@u.washington.edu

Apr 20, 2019, 1:36:44 AM
On Friday, April 19, 2019 at 10:22:00 PM UTC-7, ga...@u.washington.edu wrote:

(snip, I wrote)

> I haven't looked lately, but in Fortran 66 H format descriptors
> were allowed with READ statements, but I suspect they don't do what
> you expect.

> The read-in characters at that part of the record replace the
> characters in the FORMAT. If that same FORMAT is used with
> a WRITE statement, the read-in characters get written out.

It seems that this went away in Fortran 77.

Fortran 77 has CHARACTER variables, which are a better way to read
in and write out data. Fortran 66 has the A format descriptor to read
into and write out of numeric variables or arrays.

robin....@gmail.com

Apr 20, 2019, 4:11:08 AM
In a READ statement, you are taking data from the input device.

The format specification should contain only format items that
specify the layout of the values to be read in.

On output, the proper place for any data to be written out
is in the WRITE statement, not in the FORMAT statement or
format specification. For historical reasons, character data
is still permitted in the format specification.

Ron Shepard

Apr 20, 2019, 10:11:00 AM
On 4/20/19 3:11 AM, robin....@gmail.com wrote:
> On output, the proper place for any data to be written out
> is in the WRITE statement, not in the FORMAT statement or
> format specification. For historical reasons, character data
> is still permitted in the format specification.

I do not know of anything in the fortran standards over the years or any
interpretations by the standards committee that would support this
statement. One is free to have that opinion, of course, but I don't
agree with it and I expect that a good fraction of fortran programmers
would also disagree.

Here is an example of both approaches:

write(*,'(a,i0)') 'k=', k
write(*,'("k=",i0)') k

Both approaches are valid, and there are situations in which either
approach might be preferred over the other. There are even situations
where one might want to write a string (e.g. 'k=') into the format, and
then later use that format to write out the corresponding data (e.g. k).
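
As a minimal sketch of that last situation, assuming a made-up label 'k=' and a hypothetical variable name fmt, an internal write can place the string into a format that is then used for the data:

```fortran
program label_in_format
  implicit none
  character(len=*), parameter :: label = 'k='   ! illustrative label
  character(len=32) :: fmt                      ! hypothetical name
  integer :: k

  ! Build the format ("k=",i0) by writing the label into it.
  ! This assumes the label itself contains no double quote.
  write (fmt, '(a,a,a)') '("', label, '",i0)'

  ! Later, the same format writes out the corresponding data.
  do k = 1, 3
     write (*, fmt) k
  end do
end program label_in_format
```

The trailing blanks left in fmt after the closing parenthesis are harmless, since a format specification ends at its matching right parenthesis.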

$.02 -Ron Shepard

robin....@gmail.com

Apr 20, 2019, 11:26:31 AM
On Sunday, April 21, 2019 at 12:11:00 AM UTC+10, Ron Shepard wrote:
> On 4/20/19 3:11 AM, r......@gmail.com wrote:
> > On output, the proper place for any data to be written out
> > is in the WRITE statement, not in the FORMAT statement or
> > format specification. For historical reasons, character data
> > is still permitted in the format specification.
>
> I do not know of anything in the fortran standards over the years or any
> interpretations by the standards committee that would support this
> statement. One is free to have that opinion, of course, but I don't
> agree with it and I expect that a good fraction of fortran programmers
> would also disagree.
>
> Here is an example of both approaches:
>
> write(*,'(a,i0)') 'k=', k
> write(*,'("k=",i0)') k

You just illustrated what I just said above.

I merely pointed out that a string in a format statement is not
a layout specification, it's data.

> Both approaches are valid, and there are situations in which either
> approach might be preferred over the other. There are even situations
> where one might want to write a string (e.g. 'k='), into the format, and
> then later use that format to write out the corresponding data (e.g. k).

You can deal with that better in the WRITE data list.

ga...@u.washington.edu

Apr 21, 2019, 3:03:26 AM
I don't know of anything in the standard, but if you use apostrophes
for both, the need to double them, and quadruple them if you want to
print one, discourages some people.

I suspect that people who started in Fortran 66 days tend to
use them, and others tend not to.

Ron Shepard

Apr 21, 2019, 1:09:47 PM
This is true, but since f90 it is legal to mix single and double quotes,
as I did above in my example, and that eliminates some of the
complications. Also, this issue is not specific to format statements, it
applies to character strings in general.

> I suspect that people who started in Fortran 66 days tend to
> use them, and others tend not to.

Yes, this issue arose with f77 when quoted character strings were
introduced into the language. Before that, there were only Hollerith
strings, which, quirky as they are in other ways, did not have a quote
delimiter that needed to be escaped.

As for putting strings within formats in general, I tend to do it when
the argument list in the write statement gets too long, or when the same
format, with the same embedded strings, is used for many write
statements. This way, when a change is made to the format, it only needs
to be done in one place rather than scattered across many write statements.
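
A small sketch of that second case, with an invented layout and the hypothetical name row_fmt, keeps the embedded strings in one named constant shared by every write statement:

```fortran
program shared_format
  implicit none
  ! One named constant holds the layout and its embedded labels,
  ! so a change to the wording is made in exactly one place.
  character(len=*), parameter :: row_fmt = '("step ",i0,": x = ",f8.3)'

  write (*, row_fmt) 1, 0.5
  write (*, row_fmt) 2, 1.25
end program shared_format
```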

Also, before advance='no' was added to the write statement, I sometimes
created format strings that had embedded data in it that were then passed
through several levels of logic to create the final format for the final
output record. I've found that advance='no' can now be used to simplify
many of these situations.

$.02 -Ron Shepard

dpb

Apr 21, 2019, 4:19:02 PM
On 4/21/2019 12:09 PM, Ron Shepard wrote:
> On 4/21/19 2:03 AM, ga...@u.washington.edu wrote:...

>
> This is true, but since f90 it is legal to mix single and double quotes,
> as I did above in my example, and that eliminates some of the
> complications. Also, this issue is not specific to format statements, it
> applies to character strings in general.

+1
It doesn't matter where the string is, that's an issue. The mixing of
quoting delimiters is, indeed, a major "syntactic sugar" win...

...

> As for putting strings within formats in general, I tend to do it when
> the argument list in the write statement gets too long, or when the same
> format, with the same embedded strings, is used for many write
> statements. This way, when a change is made to the format, it only needs
> to be done in one place rather than scattered across many write statements.

+134

That's the strongest argument of all imo--there's no point in putting
constant data in output strings when it can be encapsulated at compile
time in one place.

If there is some outside chance of this ever changing, then write a
variable that is initialized somewhere and again make the change only in
one place instead of all over...


> Also, before advance='no' was added to the write statement, I sometimes
> created format strings that had embedded data in it that were then passed
> through several levels of logic to create the final format for the final
> output record. I've found that advance='no' can now be used to simplify
> many of these situations.
>
> $.02 -Ron Shepard

and imo, ymmv, etc., etc., etc., ...

--dpb

urba...@comcast.net

Apr 22, 2019, 1:46:45 AM
On Friday, April 19, 2019 at 5:02:15 PM UTC-4, nshaffer wrote:
I took the original post to be asking if Fortran has any
facility similar to the family of scan(3c) functions in
C. For example:

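#include <stdlib.h>
#include <stdio.h>
int main(void){
   int month,day,year;
   int err;
   char string[]="birthday 12/ 5/1990";
   /* literal text in the format must match the input; %i skips leading blanks */
   /* note: %i treats a leading 0 as octal, so %d is safer for fields like 08 */
   err=sscanf(string,"birthday %i/%i/%i",&month,&day,&year);
   printf("STATUS %i\n",err);   /* number of items converted */
   printf("MONTH=%i DAY=%i YEAR=%i\n",month,day,year);
   return 0;
}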
STATUS 3
MONTH=12 DAY=5 YEAR=1990

Solutions have been given for the specific examples posted, but for the
more general question posed the answer is that Fortran does not have real
equivalents for the scan(3c) functions, nor other C-based tools used
for similar general parsing problems (primarily strtok, yacc/bison,
and regular expressions).

Regular expressions can be called from Fortran thru the ISO C Binding,
and are a very powerful tool for parsing strings.

The FLIBS collection of Fortran utilities includes LEMON, which is
similar to yacc(1) or bison(1), for parsing grammars (parsing
"NAME=VALUE" would be an easy one for LEMON).

There are a number of simple token routines available on the WWW that
can be used to solve many simple parsing problems. See the M_strings(3f)
module on the Fortran Wiki site and/or other references there for some
examples.
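
To give a flavor of what such a token routine does, here is a minimal sketch using only the VERIFY and SCAN intrinsics. It is not taken from M_strings or any other library; the names text, delims, first, last and ntok are made up for illustration.

```fortran
program split_tokens
  implicit none
  character(len=*), parameter :: text   = 'birthday 12/ 5/1990'
  character(len=*), parameter :: delims = ' /'   ! blank and slash separate tokens
  integer :: first, last, n, ntok

  ntok  = 0
  first = 1
  do
     ! VERIFY gives the first position that is NOT a delimiter (0 if none).
     n = verify(text(first:), delims)
     if (n == 0) exit
     first = first + n - 1

     ! SCAN gives the first position that IS a delimiter (0 if none).
     n = scan(text(first:), delims)
     if (n == 0) then
        last = len(text)
     else
        last = first + n - 2
     end if

     ntok = ntok + 1
     write (*, '(i0,": ",a)') ntok, text(first:last)

     first = last + 1
     if (first > len(text)) exit
  end do
end program split_tokens
```

Each token could then be converted with an internal read, which is roughly the division of labor that sscanf does in one call.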

In addition to Fortran NAMELIST note that there is a module for reading JSON files on GitHub if you are looking for other methods of reading KEYWORD=VALUE
files.

ga...@u.washington.edu

Apr 22, 2019, 4:45:03 AM
On Sunday, April 21, 2019 at 10:46:45 PM UTC-7, urba...@comcast.net wrote:
> On Friday, April 19, 2019 at 5:02:15 PM UTC-4, nshaffer wrote:
> > I recently learned that character constants are disallowed in
> > the format of a read statement.

(snip)

> I took the original post to be asking if Fortran has any
> facility similar to the family of scan(3c) functions in
> C. For example:

> #include <stdlib.h>
> #include <stdio.h>
> main(){
> int month,day,year;
> int err;
> char string[]="birthday 12/ 5/1990";
> err=sscanf(string,"birthday %i/%i/%i",&month,&day,&year);
> printf("STATUS %i\n",err);
> printf("MONTH=%i DAY=%i YEAR=%i\n",month,day,year);
> }
> STATUS 3
> MONTH=12 DAY=5 YEAR=1990

It is not all that useful in scanf() itself, but as you note
sometimes useful with sscanf().

However, it is mostly useful in that sscanf() will stop without
error when it finds something that doesn't match what is expected,
and guarantees that the previously assigned variables are assigned.

I have done:

i=j=0;
sscanf(line, "%d:%d", &i, &j);   /* line: the input text (name assumed here) */
seconds = 60*i+j;

this will accept a number of minutes, or minutes:seconds
(slightly more obvious is to check the return code from sscanf(),
and allow for seconds or minutes:seconds, but it seems that isn't
how I did it in this one use for this feature).

But even if fortran allowed this, it doesn't guarantee to assign
values to variables if there is any error in the input.

Might just as well use 1X to ignore the characters.

Could use %*c in C to ignore one character, too.





nshaffer

Apr 23, 2019, 11:37:57 AM
Indeed, this is what I was angling at with my question. I had forgotten that the "scan" family of functions from the C standard library did this. It's slightly inconvenient that there's no direct analogue in Fortran, but I suppose I can live with rolling my own simple parsers or using the modules shared by kind users such as yourself and Arjen.

For some reason, this "feels" like the kind of thing a formatted read should handle, though! Oh, well.
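
For the 'foo = 42 ' case from the first post, a hand-rolled parser can be quite short. The following is only a sketch under simple assumptions (one '=' per line, an integer on the right-hand side); the names line, key, val and pos are invented for the example.

```fortran
program parse_key_value
  implicit none
  character(len=32) :: line
  character(len=:), allocatable :: key
  integer :: val, pos, ios

  line = 'foo = 42 '

  ! Locate the separator; INDEX returns 0 if '=' is absent.
  pos = index(line, '=')
  if (pos == 0) error stop 'no "=" found in input line'

  ! Everything left of '=' is the key, with surrounding blanks removed.
  key = trim(adjustl(line(:pos-1)))

  ! A list-directed internal read converts the text right of '='.
  read (line(pos+1:), *, iostat=ios) val
  if (ios /= 0) error stop 'value is not an integer'

  write (*, '(a,a,a,i0)') 'key = "', key, '", value = ', val
end program parse_key_value
```

It takes two steps instead of one read statement, but the separator no longer has to sit in a fixed column.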

ga...@u.washington.edu

Apr 25, 2019, 3:14:02 AM
On Tuesday, April 23, 2019 at 8:37:57 AM UTC-7, nshaffer wrote:

(snip)

> Indeed, this is what I was angling at with my question.

> I had forgotten that the "scan" family of functions from the C standard
> library did this. It's slightly inconvenient that there's no direct
> analogue in Fortran, but I suppose I can live with rolling my own
> simple parsers or using the modules shared by kind users such
> as yourself and Arjen.

C's printf and scanf work differently from Fortran READ/WRITE
in a lot of ways.

For scanf, the characters used by a format descriptor stop
when a character that doesn't apply is found. If you write:

scanf("%d",&i);

it will skip blanks until it finds digits, process the digits,
and if, for example, a comma or slash is in the input, stop
and leave that for whatever comes next.

Fortran will give a conversion error, and not (guarantee to) assign
any variables.
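
A small sketch of the Fortran side, with an invented input string, shows the conversion error being caught through IOSTAT. Because the value of n is not guaranteed after the error, the sketch deliberately does not print it in that branch.

```fortran
program conversion_error
  implicit none
  character(len=16) :: text
  integer :: n, ios

  text = '12/05/1990'

  ! I5 covers the characters "12/05"; a slash is not valid in an integer field,
  ! so the read raises an error condition instead of stopping at the slash.
  read (text, '(i5)', iostat=ios) n

  if (ios /= 0) then
     ! n must be treated as undefined here.
     write (*, '(a,i0)') 'conversion failed, iostat = ', ios
  else
     write (*, '(a,i0)') 'n = ', n
  end if
end program conversion_error
```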

Fortran formatted input was designed in the punched card days.
It works very well for fields in specific card columns. It was
usual not to include separators like comma or slash in input cards.

It was somewhat common not to include decimal points, with an implied
decimal between specified card columns. For production systems,
cards would be printed with the fields marked on the card.

While cards were still popular in the early C days, terminal input
was common enough, so C input was designed to work well that way.
You can give specific column widths, but the scanf defaults work
well with blank delimited fields.

The DEC compilers (and probably others) for terminal-based time-sharing
systems modify the Fortran format rules. Similar to C, conversion will
end when the field ends, with a result somewhat like list-directed input.

For both Fortran and C, it is often best to read in an input
line, and then process it appropriately.
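
In Fortran, one way to do this for the date example is sketched below. It assumes slash-separated integer fields of any width; the string assigned to line stands in for a line read from the terminal, and the variable names are illustrative.

```fortran
program line_then_parse
  implicit none
  character(len=80) :: line
  integer :: d, m, y, i, ios

  ! Stand-in for a line obtained with: read (*, '(a)') line
  line = '1/1/1965'

  ! A slash ends a list-directed read, so replace each one with a blank.
  do i = 1, len_trim(line)
     if (line(i:i) == '/') line(i:i) = ' '
  end do

  ! The list-directed internal read now splits on blanks,
  ! whatever the widths of the individual fields.
  read (line, *, iostat=ios) d, m, y
  if (ios /= 0) then
     write (*, '(a)') 'could not parse a date from the line'
  else
     write (*, '(3(a,i0,1x))') 'd=', d, 'm=', m, 'y=', y
  end if
end program line_then_parse
```

Unlike '(i2,1x,i2,1x,i4)', this accepts 1/1/1965 as well as 01/01/1965, at the cost of not checking which separator was actually typed.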

As someone previously mentioned, one way (either Fortran or C) is
to use a regular-expression parser to separate the data fields,
and process accordingly.

dpb

Apr 25, 2019, 9:31:08 AM
...

But it fails miserably for fixed-width fields without every field being
populated as was often used since Fortran will count blanks while C just
"eats" white space.

Also, the chosen arrangement of the C formatting string is terrible,
eliminating the possibility of repeat counts leading to abominations
like '%f%f'%f%f'%f%f'%f%f'%f%f'%f%f%s'%f%f'%f%f'%f%f'%f%f'

--

ga...@u.washington.edu

Apr 25, 2019, 4:06:05 PM
On Thursday, April 25, 2019 at 6:31:08 AM UTC-7, dpb wrote:

(snip, I wrote)

> > Fortran formatted input was designed in the punched card days.
> > It works very well for fields in specific card columns. It was
> > usual not to include separators like comma or slash in input cards.

> > It was somewhat common not to include decimal points, with an implied
> > decimal between specified card columns. For production systems,
> > cards would be printed with the fields marked on the card.

> > While cards were still popular in the early C days, terminal input
> > was common enough, so C input was designed to work well that way.
> > You can give specific column widths, but the scanf defaults work
> > well with blank delimited fields.
> ...

> But it fails miserably for fixed-width fields without every field being
> populated as was often used since Fortran will count blanks while C just
> "eats" white space.

You can put a maximum width on C format descriptors, so they
will read input data with no blanks in between. I am not sure
about cases with blanks and input characters.

In Fortran 66 days, I would often leave a field blank when
the value was zero, knowing it would count as zero.

In some cases, C won't store a value, so the variables should
be given an initial value first.

> Also, the chosen arrangement of the C formatting string is terrible,
> eliminating the possibility of repeat counts leading to abominations
> like '%f%f'%f%f'%f%f'%f%f'%f%f'%f%f%s'%f%f'%f%f'%f%f'%f%f'

Yes, but in the most common case, reading an array, it is
done with a loop.

for(i=0;i<n;i++) scanf("%f", x+i);

Note that if EOF is reached, all actually read values are stored.

In Fortran 77 (VAX) days, I had something like:

READ(5,*, END=999) (X(I), I=1,N)

to read an unknown number of input values, and expecting I to
be one more than the number of values read. It seems that Fortran
does not require a specific value for I, or for the elements of X
that would be read in.

I believe this is when I first knew about an actual Fortran
standard, as I had thought about sending in a bug report to DEC,
and was then told about the actual rule for such input.

In the above C case, I might do:

for(i=0; i<n; i++) if(scanf("%f", x+i) < 1) break;

which will exit the loop with i one past the index of the
last element read, or in this case, the number of elements read.

I suspect that a loop like this in Fortran, with ADVANCE='NO'
should work, but it wasn't so easy in the Fortran 77 days.


urba...@comcast.net

Apr 25, 2019, 7:35:03 PM
Intuitively, it does seem that with advance='no' and size= on a READ(3f) there would be a very similar Fortran construct. But since there is no field of arbitrary length like %f on a READ when using non-advancing I/O (you cannot use * or g0; you have to give an explicit length for the input field), I cannot think of anything as simple as the C construct. If anyone can think of one I'd be interested in seeing it. About the closest I could think of is in the Fortran Wiki at

http://fortranwiki.org/fortran/show/getvals

just using standard Fortran statements and not using a parsing routine.
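
One standard-Fortran workaround, not necessarily what the linked page does, is to hold the line in a character variable and retry a list-directed internal read with a growing item count. The sketch below assumes blank-separated reals on one line and an arbitrary cap nmax; all names are illustrative.

```fortran
program count_values
  implicit none
  integer, parameter :: nmax = 10      ! arbitrary upper limit for the sketch
  character(len=80) :: line
  real :: x(nmax)
  integer :: n, ios

  line = '1.5 2.5 3.5'

  ! Try to read 1, 2, 3, ... values from the same line. The first attempt
  ! that fails (end of data or a bad field) is one past the usable count.
  do n = 1, nmax
     read (line, *, iostat=ios) x(1:n)
     if (ios /= 0) exit
  end do
  n = n - 1

  ! The failed attempt leaves x undefined, so read the good values once more.
  if (n > 0) read (line, *) x(1:n)

  write (*, '(a,i0)') 'values read: ', n
  write (*, '(*(f6.2))') x(1:n)
end program count_values
```

This rereads the line on every attempt, so it is only reasonable for short records, but it avoids relying on what a failed read leaves behind.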



ga...@u.washington.edu

Apr 25, 2019, 8:53:26 PM
On Thursday, April 25, 2019 at 4:35:03 PM UTC-7, urba...@comcast.net wrote:

(snip, I wrote)

> > In Fortran 77 (VAX) days, I had something like:
> >
> > READ(5,*, END=999) (X(I), I=1,N)
> >
> > to read an unknown number of input values, and expecting I to
> > be one more than the number of values read. It seems that Fortran
> > does not require a specific value for I, or for the elements of X
> > that would be read in.

(snip)

> > I suspect that a loop like this in Fortran, with ADVANCE='NO'
> > should work, but it wasn't so easy in the Fortran 77 days.

> Intuitively, it does seem that with advance='no' and size= on
> a READ(3f) that there would be a very similar equivalent Fortran
> construct but since there is no field of arbitrary length like %f
> on a READ when using non-advancing I/O (you cannot use * or g0,
> you have to give an explicit length for the input field)
> I cannot think of anything as simple as the C construct.
> If anyone can think of one I'd be interested in seeing it.

Yes, I was expecting that list-directed read should work with
ADVANCE='NO', but it seems not, at least not for the version
of gfortran that I have nearby.

I don't know of a reason for the restriction, as other languages
can do it.

FortranFan

Apr 26, 2019, 8:59:58 AM
On Thursday, April 25, 2019 at 7:35:03 PM UTC-4, urba...@comcast.net wrote:

> ..
> Intuitively, it does seem that with advance='no' and size= on a READ(3f) that there would be a very similar equivalent Fortran construct but since there is no field of arbitrary length like %f on a READ when using non-advancing I/O (you cannot use * or g0 ..


@urba...@comcast.net,

First, would you know what's up with the '3f', on your latest comment and also at Fortran Wiki? I'm assuming it's your contribution. Is this something you enter explicitly? If so, do you mean it as a question mark given the ASCII code for that symbol? It's rather distracting.

With generalized editing in Fortran, the field width cannot be zero on input so G0 is clearly inapplicable given the current Fortran standard.

If one considers brevity toward reading in arbitrary input of disparate number intrinsic types, I too think C has a leg up on Fortran but Fortran appears better to me in *certain* other ways including with character data and repeated input, etc.

urba...@comcast.net

Apr 26, 2019, 6:47:22 PM
> First, would you know what's up with the '3f', on your latest comment and also at Fortran Wiki? I'm assuming it's your contribution. Is this something you enter explicitly? If so, do you mean it as a question mark given the ASCII code for that symbol? It's rather distracting.
>
Perhaps the convention of using a man page suffix is more of a localism and less of a universal convention than I thought; but it is basically a reflex for me to add a suffix as used in man page entries that distinguishes between a common term and a computer term, as so many computer terms are also English words. For example: read, read(3f), read(3c), read(1sh) would be the English word "read", the Fortran READ directive, the C read procedure, and the Bourne shell command.

That convention is used so commonly in my daily circle I guess I presumed it was a common convention. It also implies that case might be specific when used in the first word of a sentence. So "read(3c) is the C procedure that …" can be used instead of the rule that a sentence always starts with a capital, for case-sensitive terms.

I guess I spent too many hours reading and writing man pages (which I would usually write as "man(1) pages" :> ).

ga...@u.washington.edu

Apr 26, 2019, 7:41:32 PM
On Friday, April 26, 2019 at 3:47:22 PM UTC-7, urba...@comcast.net wrote:
> > First, would you know what's up with the '3f', on your latest
> > comment and also at Fortran Wiki?

(snip)

> Perhaps the convention of using a man page suffix is more of a
> localism and less of a universal convention than I thought;
> but it is basically a reflex for me to add a suffix as used in
> man page entries that distinguishes between a common term and
> a computer term, as so many computer terms are also English words.

Not quite reflex for me, though maybe in some other newsgroups,
but I am used to seeing it.

Some years ago, I used an HP-UX system with a Fortran compiler,
and often enough looked up things that way.

> For example: read, read(3f), read(3c), read(1sh) would be the
> English word "read", the Fortran READ directive, the C read procedure,
> and the Bourne shell command.

For those who want to try it, and have such system available:

man 3f read

Bourne shell is described in sh(1), but there is a
link from read(1) to builtin(1). The system call read is in read(2).
C library routines are in section 3, not 3c, and the actual
routine is fread. (It is not unusual for C programs to directly
call unix system calls, such as read(2).)

> That convention is used so commonly in my daily circle I guess
> I presumed it was a common convention. It also implies that case
> might be specific when used in the first word of a sentence.
> So "read(3c) is the C procedure that …" can be used instead of
> the rule that a sentence always starts with a capital,
> for case-sensitive terms.

I do tend to capitalize Fortran keywords in posts, not because
they used to be upper case only, but as a way to distinguish the
keyword from the English word.