including long character literal in code

Paul Anton Letnes

Aug 13, 2012, 4:50:34 AM
Hi,

I'm using embedded Lua to read my input files, and want to run some code
by the interpreter, mostly defining helper functions for the input file
writer (a file written by a human) to use.

So the Fortran question is: Is there a good and clean way to include
long character literals (aka strings) in the compiled code? I'm thinking
perhaps 50-100 lines of Lua code, some to be run before running the
input file to the simulation code, and some to be run after the input
file. I want the files to be included in the statically linked binary,
simply for convenience.

Paul

glen herrmannsfeldt

Aug 13, 2012, 5:13:44 AM
Paul Anton Letnes <paul.ant...@nospam.gmail.kthxbai.com> wrote:

> I'm using embedded Lua to read my input files, and want to run some code
> by the interpreter, mostly defining helper functions for the input file
> writer (a file written by a human) to use.

> So the Fortran question is: Is there a good and clean way to include
> long character literals (aka strings) in the compiled code?
> I'm thinking perhaps 50-100 lines of Lua code, some to be
> run before running the input file to the simulation code,
> and some to be run after the input file.

I am not sure by now what the continuation line limit is.

> I want the files to be included in the statically linked binary,
> simply for convenience.

For static data, you could use EQUIVALENCE to keep data contiguous
in memory, even if declared and initialized separately.
I presume a large array of CHARACTER data, maybe one element for
each line of Lua source.

-- glen

Paul Anton Letnes

Aug 13, 2012, 5:33:57 AM
Performance is not an issue, the files are in the kB range, so I
wouldn't worry about 'contiguous in memory' (I think?). I'm more
interested in how you would go about entering the string into Fortran
code without reading the string from a file at runtime. Is there some
trick you can do with e.g. 'include' that will run at compile-time?

By the way, Lua source lines can be separated by ';', for example:
a = 7; b = 9;
Perhaps that will get around line continuation restrictions.

Paul

Wolfgang Kilian

Aug 13, 2012, 5:50:38 AM
On 08/13/2012 11:33 AM, Paul Anton Letnes wrote:
> On 13.08.12 11:13, glen herrmannsfeldt wrote:
>> Paul Anton Letnes<paul.ant...@nospam.gmail.kthxbai.com> wrote:
>>
>>> I'm using embedded Lua to read my input files, and want to run some code
>>> by the interpreter, mostly defining helper functions for the input file
>>> writer (a file written by a human) to use.
>>
>>> So the Fortran question is: Is there a good and clean way to include
>>> long character literals (aka strings) in the compiled code?
>>> I'm thinking perhaps 50-100 lines of Lua code, some to be
>>> run before running the input file to the simulation code,
>>> and some to be run after the input file.

You can concatenate strings at compile time, see:

program long_text

integer, parameter :: l = 8, lines = 2, rep = 1000

character(l), parameter :: a001 = "abcdefgh"
character(l), parameter :: a002 = "lmnopqrs"

character(l*lines), parameter :: a = a001 // a002

character(rep*l*lines), parameter :: a_long = repeat (a, rep)

print *, a

print *, len (a_long)

print *, a_long

end program long_text

The limit for 'len' is related to the line length and continuation
limits, but the limit for 'lines' or 'rep' is set only by the length of
character string parameters that the compiler can handle, I guess.

Such code can easily be generated automatically by a simple script.

> By the way, Lua source lines can be separated by ';', for example:
> a = 7; b = 9;
> Perhaps that will get around line continuation restrictions.

You may embed newline characters if needed.
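
A minimal sketch of the newline idea, assuming the ASCII line feed (`achar(10)`) is the separator the Lua interpreter should see. The Lua function `double_it` and the names `nl` and `lua_code` are made-up placeholders, not anything from the original poster's program:

```fortran
program embed_newlines
  implicit none
  ! Assumption: character 10 (ASCII line feed) is the desired line separator.
  character(len=1), parameter :: nl = achar(10)
  ! One literal per Lua line, joined at compile time with //.
  character(len=*), parameter :: lua_code = &
       'function double_it(x)' // nl // &
       '  return 2 * x'        // nl // &
       'end'                   // nl
  ! The whole script is now a single scalar string constant.
  print '(a)', lua_code
end program embed_newlines
```

Each source line stays short, so only the continuation-line limit matters, not the line-length limit.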

>
> Paul
>

-- Wolfgang


--
E-mail: firstnameini...@domain.de
Domain: yahoo

Wolfgang Kilian

Aug 13, 2012, 5:52:04 AM
> The limit for 'len'

Typo: I mean 'l' here.

Terence

Aug 13, 2012, 7:02:31 AM
I would think, arrange to read the desired long literal text strings from
an exterior text file into a declared character array. This way the text
strings can be in any chosen language.
I've been doing that since 1972 (but limited by F77 to 64k bytes arrays by
one still-used/loved early compiler, but not by a later F90 one) and it
works well for all European languages that are based on single-byte
characters. I'm sure the same can work for two-byte character tables too
like non-western script and ideographic languages.




Paul Anton Letnes

Aug 13, 2012, 7:13:01 AM
This is interesting. I could write the text file as-is, use e.g. the
unix 'split' command to split it into, say, 100 character chunks, write
a small script to put those chunks into a .f90 module, and finally 'use'
the module. I'll certainly think about it - wish it was a bit less
involved, though.

Paul
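
The "small script" step could itself be written in Fortran. The following is only a sketch under stated assumptions: the input file name `myscript.lua`, the output file `lua_code_mod.f90`, the module name `lua_code_mod` and the constant `lua_code` are all hypothetical, Lua lines are assumed to be at most 100 characters (longer lines are silently truncated), and `newunit=` needs a Fortran 2008 compiler:

```fortran
program gen_lua_module
  implicit none
  integer, parameter :: maxlen = 100   ! assumed upper bound on a Lua line
  character(len=maxlen) :: line
  integer :: uin, uout, ios, i

  open (newunit=uin,  file='myscript.lua',     status='old',     action='read')
  open (newunit=uout, file='lua_code_mod.f90', status='replace', action='write')

  ! Header of the generated module.
  write (uout, '(a)') 'module lua_code_mod'
  write (uout, '(a)') '  implicit none'
  write (uout, '(a)') '  character(len=*), parameter :: lua_code = &'

  do
     read (uin, '(a)', iostat=ios) line
     if (ios /= 0) exit
     ! Emit the line as a quoted literal, doubling any apostrophes.
     write (uout, '(a)', advance='no') "    '"
     do i = 1, len_trim(line)
        if (line(i:i) == "'") then
           write (uout, '(a)', advance='no') "''"
        else
           write (uout, '(a)', advance='no') line(i:i)
        end if
     end do
     ! Append a line feed and continue the concatenation on the next line.
     write (uout, '(a)') "' // achar(10) // &"
  end do

  ! An empty literal terminates the chain of // operators.
  write (uout, '(a)') "    ''"
  write (uout, '(a)') 'end module lua_code_mod'

  close (uin)
  close (uout)
end program gen_lua_module
```

Run as a build step, this would produce a module that the main program can `use`. A Lua line with many apostrophes could push a generated line past the 132-column limit, and a script longer than the compiler's continuation-line limit would need to be split into several constants.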

Paul Anton Letnes

Aug 13, 2012, 7:14:35 AM
I see, the files would only contain a few function calls and
assignments; nothing complicated. I'm pretty sure one-byte characters
will suffice. How do you practically go about doing this at compile-time?

Paul

Paweł Biernat

Aug 13, 2012, 7:20:44 AM
There is a pretty neat way of doing this in C:
http://burtonini.com/blog/computers/ld-blobs-2007-07-13-15-50

I am trying to make it work with Fortran using iso_c_binding but there are some memory alignment problems and I have no knowledge to resolve them. Maybe someone more resourceful will be able to help with it.

Regards,
Paweł Biernat.

dpb

Aug 13, 2012, 9:06:12 AM
On 8/13/2012 6:14 AM, Paul Anton Letnes wrote:
> On 13.08.12 13:02, Terence wrote:
>> I would think, arrange to read the desired long literal text strings from
>> an exterior text file into a declared character array. ...
>
...

> ... How do you practically go about doing this at compile-time?

That's a runtime solution w/ the facility to modify the literals w/o
recompilation that Terence described.

I guess I don't see the issue much in just initializing a character
array using whatever source code generation mechanism you wish...but I
don't quite follow what you're doing w/ the data precisely so perhaps
the problem is you want a data area that isn't actually Fortran data?

--

Tobias Burnus

Aug 13, 2012, 9:34:34 AM
On 08/13/2012 01:20 PM, Paweł Biernat wrote:
> On Monday, 13 August 2012 10:50:34 UTC+2, Paul Anton Letnes wrote:
>> So the Fortran question is: Is there a good and clean way to include
>>
>> long character literals (aka strings) in the compiled code? I'm thinking
>> perhaps 50-100 lines of Lua code, some to be run before running the
>> input file to the simulation code, and some to be run after the input
>> file. I want the files to be included in the statically linked binary,
>> simply for convenience.

Well, you could simply use something like:

character(len=*), parameter :: str = 'Hel&
lo Wor&
&ld!'
print *, str
end

That is: A single string literal. To be conforming, the lines should be
maximally 132 characters (or 72 in fixed form) long. I think Fortran <=
95 limits the number of continuation lines to 19 (fixed form) and 39
(free form source code), which was extended to 255 in Fortran 2003 and
later. I think many compilers happily accept more, some (like gfortran
with -std=f95/f2003) warn, and others simply reject (without having a
special flag).

Note: The example above uses a Fortran 2008 feature: the "len=*". In
Fortran 95/2003, you need to calculate yourself the number of characters.


> There is a pretty neat way of doing this in C:
> http://burtonini.com/blog/computers/ld-blobs-2007-07-13-15-50
>
> I am trying to make it work with Fortran using iso_c_binding but there are some memory alignment problems and I have no knowledge to resolve them. Maybe someone more resourceful will be able to help with it.

The following not very elegant code works here. I do get alignment
warnings, but one can ignore them.

I also don't get a scalar character string but an array of len=1
strings. One can surely make the code nicer. A brute force method would
be "cstr_start(1:len)" but that's not really nicer. One could also think
about copying the data to a scalar character string of the proper
length, but that duplicates the data.

Tobias


$ echo 'Hello Fortran World!' > foo.txt

$ ld -r -b binary -o foo.o foo.txt

$ gfortran -g fortran.f90 foo.o && ./a.out
/usr/bin/ld: Warning: alignment 1 of symbol `_binary_foo_txt_end' in
foo.o is smaller than 8 in /tmp/ccX0oFRi.o
/usr/bin/ld: Warning: alignment 1 of symbol `_binary_foo_txt_start' in
foo.o is smaller than 8 in /tmp/ccX0oFRi.o
22
Hello Fortran World!

$ cat fortran.f90
module m
use iso_c_binding
implicit none
character(c_char), bind(C, name='_binary_foo_txt_start'), target :: cstr_start
character(c_char), bind(C, name='_binary_foo_txt_end'), target :: cstr_end
end module m

use m
implicit none
character(len=1, kind=c_char), pointer :: str(:)
type(c_ptr) :: cptr
integer(c_intptr_t) :: len

len =transfer(c_loc(cstr_end),len) - transfer(c_loc(cstr_start),len) + 1
print *, len
call c_f_pointer (c_loc (cstr_start), str, shape=[len])
print '(*(a))', str
end
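
The "copy to a scalar string" option mentioned above could look roughly like this. It is a sketch only: the helper name `to_string` is hypothetical, and a hard-coded array stands in for the linker-provided data so that the example runs on its own:

```fortran
program array_to_scalar
  implicit none
  ! Stand-in for the len=1 array obtained via c_f_pointer.
  character(len=1) :: chars(5) = ['H', 'e', 'l', 'l', 'o']

  print '(a)', to_string(chars)   ! prints: Hello
contains
  ! Copy an array of single characters into one scalar string.
  pure function to_string(a) result(s)
    character(len=1), intent(in) :: a(:)
    character(len=size(a)) :: s
    integer :: i
    do i = 1, size(a)
       s(i:i) = a(i)
    end do
  end function to_string
end program array_to_scalar
```

As noted, this duplicates the data, which hardly matters for scripts in the kB range.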

Paweł Biernat

Aug 13, 2012, 9:37:49 AM
I created a simple example of how to apply the mentioned data embedding strategy into executable files:

https://gist.github.com/3340727

to build and run use
FC=gfortran make

I am far from understanding what exactly is going on in the assembler file (I found it somewhere on the web), but it seems to be working just fine.

Paweł Biernat.

James Van Buskirk

Aug 13, 2012, 12:11:56 PM
"Tobias Burnus" <bur...@net-b.de> wrote in message
news:5029026...@net-b.de...

> character(c_char), bind(C, name='_binary_foo_txt_start'), target ::
> cstr_start

I wish that compilers let you scatter the KIND numbers so that
intrinsic variables of different types couldn't have the same
KINDs (except for LOGICAL and INTEGER) and the above construct
could be detected.

If the code is going to be automatically generated, then DATA
statements could do the trick:

C:\gfortran\clf\long_string>type long_string.f90
module ls
implicit none
character(29) la
data la(1:10) /'Write(*,*)'/
data la(11:25) /'"Hello, world!"'/
data la(26:29) /';end'/
end module ls

program test
use ls
implicit none
write(*,'(a)') la
end program test

C:\gfortran\clf\long_string>gfortran long_string.f90 -olong_string

C:\gfortran\clf\long_string>long_string
Write(*,*)"Hello, world!";end

For Windows programs, there is rc.exe or windres.exe that can
embed just about anything in an executable.

http://msdn.microsoft.com/en-us/library/windows/desktop/aa381054(v=vs.85).aspx

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end


Ian Harvey

Aug 13, 2012, 2:34:09 PM
On 2012-08-13 11:34 PM, Tobias Burnus wrote:
...
>
> Well, you could simply use something like:
>
> character(len=*), parameter :: str = 'Hel&
> lo Wor&
> &ld!'
...
>
> Note: The example above uses a Fortran 2008 feature: the "len=*". In
> Fortran 95/2003, you need to calculate yourself the number of characters.

Here be the classic symptoms of someone that has been spending too much
time trying to make Fortran and C all happy together - they get all
confused about which century they are in when it comes to language features.

William Clodius

Aug 13, 2012, 9:01:52 PM
Tobias Burnus <bur...@net-b.de> wrote:

> On 08/13/2012 01:20 PM, Paweł Biernat wrote:
> > On Monday, 13 August 2012 10:50:34 UTC+2, Paul Anton Letnes wrote:
> >> So the Fortran question is: Is there a good and clean way to include
> >>
> >> long character literals (aka strings) in the compiled code? I'm thinking
> >> perhaps 50-100 lines of Lua code, some to be run before running the
> >> input file to the simulation code, and some to be run after the input
> >> file. I want the files to be included in the statically linked binary,
> >> simply for convenience.
>
> Well, you could simply use something like:
>
> character(len=*), parameter :: str = 'Hel&
> lo Wor&
> &ld!'
> print *, str
> end
> <snip>

Shouldn't that be

character(len=*), parameter :: str = 'Hel&
&lo Wor&
&ld!'
print *, str
end

i.e. I think you were missing an ampersand. Myself, I don't like the odd
behavior of ampersands used as continuation within strings and prefer

character(len=*), parameter :: str = 'Hel' // &
'lo Wor' // &
'ld!'
print *, str
end

glen herrmannsfeldt

Aug 13, 2012, 9:45:28 PM
William Clodius <wclo...@earthlink.net> wrote:
> Tobias Burnus <bur...@net-b.de> wrote:

(snip)
>> character(len=*), parameter :: str = 'Hel&
>> lo Wor&
>> &ld!'
>> print *, str
>> end
>> <snip>

> Shouldn't that be

> character(len=*), parameter :: str = 'Hel&
> &lo Wor&
> &ld!'
> print *, str
> end

> i.e. I think you were missing an ampersand.

Seems so:

"If a noncharacter context is to be continued, an '&' shall
be the last nonblank character on the line, or the last
nonblank character before an '!'. There shall be a later
line that is not a comment; the statement is continued on
the next such line. If the first nonblank character on that
line is an '&', the statement continues at the next character
position following that '&'; otherwise, it continues with
the first character position of that line."

Note above that the continuation line doesn't need an &, and that
if the first non-blank isn't an &, then it continues starting at
column 1. Extra blanks between tokens, though, don't have any effect.

"If a lexical token is split across the end of a line, the
first nonblank character on the first following noncomment
line shall be an '&' immediately followed by the successive
characters of the split token."

In this case, the leading & is required. Still, following the
rules above, and starting with column 1 would give the right
effect if the continued token started in column 1.


"If a character context is to
be continued, an '&' shall be the last nonblank character on
the line and shall not be followed by commentary. There shall
be a later line that is not a comment; an '&' shall be the
first nonblank character on the next such line and the
statement continues with the next character following that '&'."

Again, if the first non-blank isn't &, and the compiler treats
the continuation as if it started in column 1, then nothing bad
happens. It isn't standard, but it can work as an extension.

-- glen
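
For reference, a small sketch of the conforming form of a character-context continuation described in the last quoted rule. The blanks before each leading `&` are not part of the string:

```fortran
program continuation_demo
  implicit none
  ! Each continued line ends with & and the next one starts with &,
  ! so the literal resumes with the character right after the leading &.
  character(len=*), parameter :: s = 'Hel&
       &lo Wor&
       &ld!'
  print '(a)', s         ! prints: Hello World!
  print '(i0)', len(s)   ! prints: 12
end program continuation_demo
```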

Louisa

Aug 14, 2012, 12:49:00 AM
On Aug 13, 6:50 pm, Paul Anton Letnes
Long strings can be incorporated by appending '&' to each line except
the last, and by prefixing each line with '&' except for the first.
Each line would need to be kept to a convenient
size, say, 80 to 100 characters (and in any case, less than the
maximum source line length).

A better way might be to read in the string from a file.

Ron Shepard

Aug 14, 2012, 1:00:52 AM
In article <5029026...@net-b.de>,
Tobias Burnus <bur...@net-b.de> wrote:

> Well, you could simply use something like:
>
> character(len=*), parameter :: str = 'Hel&
> lo Wor&
> &ld!'
> print *, str
> end
[...]
> Note: The example above uses a Fortran 2008 feature: the "len=*". In
> Fortran 95/2003, you need to calculate yourself the number of characters.

Actually, you can define character constants without counting
characters all the way back to f77 where character variables were
originally introduced in the standard.

character*(*) lower
parameter ( lower = 'abcdefghijklmnopqrstuvwxyz' )

You had to do it on two lines, but that is typical of f77 syntax.
In f90 and later, you could put all the attributes on one line.

$.02 -Ron Shepard

Louisa

Aug 14, 2012, 1:14:47 AM
On Aug 14, 3:00 pm, Ron Shepard <ron-shep...@NOSPAM.comcast.net>
wrote:
> In article <5029026A.80...@net-b.de>,
implicit none
character*(*) lower
parameter ( lower = 'abcdefghijklmnopqrstuvwxyz' )

character(len=*), parameter :: upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

print *, lower
print *, upper
end

Tobias Burnus

Aug 14, 2012, 2:14:06 AM
Ron Shepard wrote:
> In article <5029026...@net-b.de>,
>> Note: The example above uses a Fortran 2008 feature: the "len=*". In
>> Fortran 95/2003, you need to calculate yourself the number of characters.
>
> Actually, you can define character constants without counting
> characters all the way back to f77 where character variables were
> originally introduced in the standard.

Sorry, I mixed that up with arrays. For arrays, using * is new in
Fortran 2008:

character(len=*), parameter :: array(*) = [ "abc", "def", "ghi" ]
end

Thanks for correcting my statement.

Tobias

Louisa

Aug 14, 2012, 1:04:25 AM
On Aug 14, 11:45 am, glen herrmannsfeldt <g...@ugcs.caltech.edu>
wrote:

>   "If a character context is to
>    be continued, an '&' shall be the last nonblank character on
>    the line and shall not be followed by commentary. There shall
>    be a later line that is not a comment; an '&' shall be the
>    first nonblank character on the next such line and the
>    statement continues with the next character following that '&'."
>
> Again, if the first non-blank isn't &, and the compiler treats
> the continuation as if it started in column 1,

No it doesn't.

> then nothing bad happens.

Compiler issues an error message, and terminates (fatal error).

It's not clear why the rule was written that way.
Could it be ambiguous if the '&' were omitted?

glen herrmannsfeldt

Aug 14, 2012, 3:41:18 AM
Louisa <louisa...@gmail.com> wrote:

(snip, I wrote)
>> Again, if the first non-blank isn't &, and the compiler treats
>> the continuation as if it started in column 1,

> No it doesn't.

>> then nothing bad happens.

> Compiler issues an error message, and terminates (fatal error).

I never tried it, and certainly not on all compilers.

> It's not clear why the rule was written that way.
> Could it be ambiguous if the '&' were omitted?

It is ambiguous if the next character is &, but then it isn't,
as there is an & in column 1.

-- glen

Louisa

Aug 14, 2012, 4:08:46 AM
On Aug 14, 5:41 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
If an '&' weren't required to start the next line,
then your instance would not be ambiguous.

Thus, the question remains.

Paul Anton Letnes

Aug 14, 2012, 5:46:35 AM

> A better way might be to read in the string from a file.
>

I agree wholeheartedly. However, as I said, I'd prefer (for convenience)
to keep everything in the precompiled binary. This has more to do with
what is convenient for distributing the program than anything else.

Paul


Louisa

Aug 14, 2012, 7:16:01 AM
On Aug 14, 7:46 pm, Paul Anton Letnes
Perhaps a neater way of including the string in the code
is to treat each line as a separate string,
and to store each line as an element of an array.

A long string, continued over many lines, may run into a hard
limit on the number of continued lines. The standard requires
a minimum of 255 (continuation) lines.

I see that you have fewer lines than 255, but you didn't say how
long those lines were.

If you want the Lua code as a single string, you could have
each line as a separate string, and join them all
(or batches of them) using //.
That can be done in an initialization, as (I think)
someone suggested already, in a [character, parameter] statement.

Then, for really long strings, any separate variables can be joined.
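
A sketch of the one-line-per-array-element approach, joined into a single string at run time. The Lua content, the names `lua_lines` and `code`, and the width of 24 are illustrative assumptions; the array constructor with a type-spec and the deferred-length string need a Fortran 2003 compiler:

```fortran
program lines_array
  implicit none
  integer, parameter :: width = 24   ! assumed maximum Lua line length
  ! The type-spec pads every literal to the same length.
  character(len=width), parameter :: lua_lines(3) = &
       [ character(len=width) :: &
         'function triple(x)', &
         '  return 3 * x', &
         'end' ]
  character(len=:), allocatable :: code
  integer :: i

  ! Join the lines, dropping trailing padding and adding a line feed.
  code = ''
  do i = 1, size(lua_lines)
     code = code // trim(lua_lines(i)) // achar(10)
  end do
  print '(a)', code
end program lines_array
```

Here the join happens at run time rather than in an initialization, which sidesteps very long constant expressions at the cost of a short loop.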

Louisa

Aug 14, 2012, 7:24:27 AM
On Aug 14, 7:46 pm, Paul Anton Letnes
A possible treatment according to my suggestion is:

character (len=26), parameter :: s1 = 'abcdefghijklmnopqrstuvwxyz'
character (len=26), parameter :: s2 = '12345678901234567890123456'
character (len=52), parameter :: s3 = s1 // s2

Terence

Aug 14, 2012, 6:13:31 PM

On 13.08.12 13:02, Terence wrote:
>> I would think, arrange to read the desired long literal text strings from
>> an exterior text file into a declared character array. This way the text
>> strings can be in any chosen language.
>> I've been doing that since 1972 (but limited by F77 to 64k bytes arrays by
>> one still-used/loved early compiler, but not by a later F90 one) and it
>> works well for all European languages that are based on single-byte
>> characters. I'm sure the same can work for two-byte character tables too
>> like non-western script and ideographic languages.
>

Paul responds:

>I see, the files would only contain a few function calls and
>assignments; nothing complicated. I'm pretty sure one-byte characters
>will suffice. How do you practically go about doing this at compile-time?

The point is that literal strings are always a form of text, even if used as
a Format statement. ('literal' and 'literature' have the same root).

Text strings are always about communicating, and the communication content
should always be outside the program (until the day all humanity only uses
one single language). So the literals should not be compiled into the code,
just that the SPACE for the communication messages, as variables, should be
reserved at compile time (I do this in the simplest form with a set of
messages and the length of each string indexed by the message number; a
more general case is to use a block of text with pointers to, and lengths of
the messages, indexed by their cardinal number).

I am constantly horrified by the problems posted here, as being the result
of either not approaching properly the algorithm to be used, or the
complexity of the structuring chosen when implementing the algorithm.
And you all wonder why I use a simple compiler by preference, when I have
alternatives?
I prefer to only deal with one problem at coding time, and that is the
correctness of the algorithm.
.. Ah well, I suspect it's a learning curve...




Terence

Aug 14, 2012, 6:22:44 PM
Don't you see what you've all been doing? Squabbling over line lengths and
ampersands and code implementations. This is letting particulars of the
Fortran language standard's syntax dictate your coding approach.

Step back and take a long view.

The OP's question was really 'how do you get long lines of text to be
produced by the program at run time, where does this text come from, and
where and how should it be stored?' (and, not asked but especially
important, what if it might need modifying at some point?).



Paul Anton Letnes

Aug 15, 2012, 3:53:38 AM
Exactly, spot on. I'd prefer a file 'myscript.lua' that I could include
in my code at compile-time as
character(*), parameter :: code = loadfile('myscript.lua')
in a pseudocode kind of way; I hope you get the idea.

Paul.

dpb

Aug 15, 2012, 9:18:22 AM
On 8/15/2012 2:53 AM, Paul Anton Letnes wrote:
...

> ... I'd prefer a file 'myscript.lua' that I could include
> in my code at compile-time as
> character(*), parameter :: code = loadfile('myscript.lua')
> in a pseudocode kind of way; I hope you get the idea.

Basically

INCLUDE

or a preprocessor, then...

--
