help on some design on processing a file


Pascal

Oct 7, 2016, 10:46:05 AM
Hi,

I am processing a file which has about a hundred keywords. Each keyword
would be processed by a particular subroutine.

I am trying to write some code where I could call a subroutine depending
on the content of a variable. Something like this:
    call process(keyword, argument)
where the subroutine named by `keyword` would be called with the argument `argument`.

Is there any way to do something like this?

I could do:
    SELECT CASE (keyword)
    CASE ('keyword1')
        call keyword1(argument)
    CASE ('keyword2')
        call keyword2(argument)
    CASE DEFAULT
        call error('does not exist')
    END SELECT

But that would mean modifying the select case when adding a new keyword.
I am wondering if there is an easier way.

Pascal

vladim...@gmail.com

Oct 7, 2016, 1:06:36 PM
> But that would mean modifying the select case when adding a new keyword.
> I am wondering if there is an easier way.

Depends what you call easier, but to have more flexibility you can use a hashtable library and procedure pointers pointing to your subroutines.

Louis Krupp

Oct 7, 2016, 4:04:24 PM
On Fri, 7 Oct 2016 15:46:03 +0100, Pascal <pasc...@parois.net>
wrote:
It depends on what you mean by "easier."

If you are (or if you want to be) familiar with dynamically loaded
libraries, you could create a function for each keyword and load it
dynamically and then call it when that keyword appears. You might
have to write a C interface.

If your main program does very little besides calling the appropriate
subroutine for each keyword it sees, you could put each of those
subroutines in its own program and have your main program (which you
can now call a "driver") use execute_command_line() to run the
appropriate program. I believe this is a Fortran 2008 feature, so
your compiler might or might not have it. If it doesn't, there could
be more C code in your future.
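
A minimal sketch of the "driver" idea, assuming each keyword has its own stand-alone executable named `kw_<keyword>` in the current directory. That naming convention, and the names `driver_m`, `build_command` and `run_keyword`, are hypothetical and only serve the illustration; `execute_command_line` is the Fortran 2008 intrinsic mentioned above.

```fortran
module driver_m
   implicit none
   private
   public :: build_command, run_keyword

contains

   ! Build the shell command for one keyword.
   ! The "./kw_" prefix is a made-up naming convention for this sketch.
   function build_command(keyword, argument) result(command)
      character(len=*), intent(in)  :: keyword, argument
      character(len=:), allocatable :: command

      command = "./kw_" // trim(adjustl(keyword)) // " " // trim(argument)
   end function build_command

   ! Run the program for a keyword and report whether it succeeded.
   subroutine run_keyword(keyword, argument, ok)
      character(len=*), intent(in)  :: keyword, argument
      logical,          intent(out) :: ok
      integer :: exitstat, cmdstat

      exitstat = 0
      call execute_command_line(build_command(keyword, argument), &
                                wait=.true., exitstat=exitstat, cmdstat=cmdstat)
      ! cmdstat /= 0: the command could not be executed at all;
      ! exitstat /= 0: the command (or the shell) reported a failure.
      ok = (cmdstat == 0 .and. exitstat == 0)
   end subroutine run_keyword

end module driver_m
```

A driver would then do something like `call run_keyword(keyword, argument, ok)` for each line it reads and report an unknown keyword when `ok` is false. Note that the argument goes to the shell unquoted, so this is only reasonable for trusted input files.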

As an alternative to that, you could write the main program in a
scripting language like Perl or Python and have it read the input file
and run the appropriate programs, passing each one any data it needs.

Have fun.

Louis

Terence

Oct 7, 2016, 5:43:27 PM

"Pascal" wrote
>I am processing a file which has about a hundred keywords. Each keyword
>would be processed by a particular subroutine.

>I am trying to write some code where I could call a subroutine depending on
>the content of a variable. Something like this:
> call process(keyword, argument)
> Where `keyword` would be called with the argument `argument`.

>Is there any way to do something like this?
(snipped)
>But that would mean modifying the select case when adding a new keyword. I
>am wondering if there is an easier way.
Pascal

You must have a list of your keywords either inside or outside your program.
So, it is best if OUTSIDE, since this allows the use of ANY USER language by
just changing the file supplied.
This text file is first read, line by line, and best if each keyword is
preceded by its unique number, since this allows easy removal of any
unwanted words in the future, without changing the program.
So there are two arrays: one holds the series of ID numbers and the other
the keywords matching those numbers, both in the same index order.
If the external keyword list is arranged in the most probable order, the
internal search, which yields the routine number assigned to the keyword,
is faster.
The rest is simple programming; get the keyword; find the ID number; call
the required routine by ID number.
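
One possible reading of the keyword list described above, as a rough sketch only: each line of the external list holds an ID number followed by a keyword, and the lines are parsed into two parallel arrays. The module and procedure names, the 16-character keyword limit and the choice to skip unreadable lines are assumptions of this sketch, not details given in the post. The lines are passed in as a character array so that the example does not depend on a particular file.

```fortran
module keyword_table_m
   implicit none
   private
   public :: load_keywords, find_id
   integer, parameter, public :: kw_len = 16   ! assumed maximum keyword length

contains

   ! Parse lines of the form "<id> <keyword>" into two parallel arrays.
   subroutine load_keywords(lines, ids, words)
      character(len=*), intent(in) :: lines(:)
      integer, allocatable, intent(out) :: ids(:)
      character(len=kw_len), allocatable, intent(out) :: words(:)
      integer :: i, ios, n

      allocate(ids(size(lines)), words(size(lines)))
      n = 0
      do i = 1, size(lines)
         read(lines(i), *, iostat=ios) ids(n + 1), words(n + 1)
         if (ios == 0) n = n + 1    ! lines that do not parse are skipped
      end do
      ids   = ids(:n)
      words = words(:n)
   end subroutine load_keywords

   ! Return the ID listed for a keyword, or 0 when the keyword is unknown.
   function find_id(word, ids, words) result(id)
      character(len=*), intent(in) :: word
      integer,          intent(in) :: ids(:)
      character(len=*), intent(in) :: words(:)
      integer :: id, i

      id = 0
      do i = 1, size(words)
         if (words(i) == word) then
            id = ids(i)
            return
         end if
      end do
   end function find_id

end module keyword_table_m
```

The ID returned by `find_id` still has to be turned into a call somehow (a SELECT CASE on the integer, for instance), which is the part the rest of the thread discusses.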

Oldster

Richard Maine

Oct 7, 2016, 6:08:41 PM
Terence <tbwr...@cantv.net> wrote:

> The rest is simple programming; get the keyword; find the ID number; call
> the required routine by ID number.

Looks like all you've done in the elided part is convert a string to an
index number and thus translated the problem of calling a procedure
based on a string to one of calling a procedure based on a number. That
doesn't seem fundamentally much different. Yes, it is "simple
programming", but so was the original problem. Indeed, the OP showed a
simple solution using select case - simple, but just a bit long-winded.

One could do something like use the number to index into an array of
procedure pointers (using the usual hack to create something akin to an
array of pointers). But the essential part of that is the idea to use
procedure pointers as Vladimir and Louis mentioned. Given that basic
idea, I'd consider the details of how one then indexed such pointers to
be the SMOP (small/simple matter of programming).
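
A minimal sketch of the procedure-pointer idea, under assumed names: the "usual hack" is to wrap the pointer in a derived type, because Fortran allows an array of a derived type but not an array of pointers. The handlers `keyword1` and `keyword2`, the `reply` argument (added here only so that the result can be inspected) and the 16-character keyword field are hypothetical.

```fortran
module dispatch_m
   implicit none
   private
   public :: dispatch

   abstract interface
      subroutine handler(argument, reply)
         character(len=*), intent(in)  :: argument
         character(len=*), intent(out) :: reply
      end subroutine handler
   end interface

   ! Wrapper type: one keyword together with the procedure that handles it.
   type :: entry_t
      character(len=16) :: keyword = ""
      procedure(handler), pointer, nopass :: run => null()
   end type entry_t

   type(entry_t), save :: table(2)
   logical,       save :: ready = .false.

contains

   subroutine keyword1(argument, reply)
      character(len=*), intent(in)  :: argument
      character(len=*), intent(out) :: reply
      reply = "keyword1:" // argument
   end subroutine keyword1

   subroutine keyword2(argument, reply)
      character(len=*), intent(in)  :: argument
      character(len=*), intent(out) :: reply
      reply = "keyword2:" // argument
   end subroutine keyword2

   ! Adding a keyword means adding one entry here (and enlarging the table).
   subroutine fill_table()
      table(1)%keyword = "keyword1"
      table(1)%run => keyword1
      table(2)%keyword = "keyword2"
      table(2)%run => keyword2
      ready = .true.
   end subroutine fill_table

   ! Look the keyword up, then call through the pointer stored at that index.
   subroutine dispatch(keyword, argument, reply, found)
      character(len=*), intent(in)  :: keyword, argument
      character(len=*), intent(out) :: reply
      logical,          intent(out) :: found
      integer :: i

      if (.not. ready) call fill_table()
      found = .false.
      reply = ""
      do i = 1, size(table)
         if (table(i)%keyword == keyword) then
            call table(i)%run(argument, reply)
            found = .true.
            return
         end if
      end do
   end subroutine dispatch

end module dispatch_m
```

The table still has to be edited for every new keyword, so this moves the bookkeeping out of a SELECT CASE rather than removing it. The linear search could be replaced by a hash table without changing the call through the pointer.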

Or did you have some other trick in mind when you say "call the required
routine by ID number?" I can't think of any other obvious way to do that
directly in Fortran other than via procedure pointers. There are
certainly approaches that lean heavily on things outside of Fortran.
Previous posters mentioned some. Another would be to build a jump table
in assembly and call that from Fortran. One could probably do something
using type-bound procedures instead of procedure pointers, though that's
going in the direction of more complicated instead of less so. There are
contexts where the type-bound procedure thing could make sense.

--
Richard Maine
email: last name at domain . net
dimnain: summer-triangle

FortranFan

Oct 7, 2016, 6:35:18 PM
On Friday, October 7, 2016 at 5:43:27 PM UTC-4, Terence wrote:

> ..
>
> You must have a list of your keywords either inside or outside your program.
> So, it is best if OUTSIDE, since this allows the use of ANY USER language by
> just changing the file supplied.
> ..
> The rest is simple programming; get the keyword; find the ID number; call
> the required routine by ID number.
> ..

@Terence,

This is the second time I've noticed you suggest an approach involving ID number and files and so forth, previous occasion being this:
https://groups.google.com/forum/#!topic/comp.lang.fortran/oa9UdyIR10c

You are clearly quite happy with your technique that has served you well over the years and which you now recommend to others in the context of "string" handling. However from what I can infer from your description, it just comes across as "less easy" (however vaguely one might define it!) than the options available in the language.

Will it be possible for you to show some actual code with a fully worked out example along with your analysis of the benefits of your technique relative to other options within the language, such as with (a possibly long) SELECT CASE construct mentioned in the original post?

You may also want to consider a formal publication in "Scientific Programming" or ACM Fortran Forum, etc. if you feel your technique has broader relevance in computing.

Gordon Sande

Oct 7, 2016, 7:12:49 PM
On 2016-10-07 21:43:16 +0000, Terence said:

> The rest is simple programming; get the keyword; find the ID number;
> call the required routine by ID number.

When I read what was requested, it sure looked like the "required routine"
would also change, even to a new routine. That requires either a small source
change or dynamic loading and linking. The latter does not seem
like something a beginner should use as a first try. Fine for experienced
"system programmers".

campbel...@gmail.com

Oct 7, 2016, 11:16:30 PM
For simplicity and clarity, the original proposal of SELECT CASE looks like a good place to start. I haven't seen any suggestion as to why this is a bad idea.
If there are about 100 keywords, you may wish to first scan the keyword for validity and possibly give the keyword a group number/identifier and so have separate SELECT CASE constructs for each group. This may make the coding a bit easier to maintain.
There is also no discussion of "argument(s)". Managing this may make the SELECT CASE a bit long, which may be helped by groups of keywords, based on their argument building.
You may also wish to first map the keyword to a different name, so that multiple keywords could use the same routine, although the original SELECT CASE approach can easily manage this.

First, use your idea, get it working, then see where it needs improving. It will work.

FJ

Oct 8, 2016, 4:21:36 AM
In any case, if you want to add a new keyword associated with a specific
treatment, you will have to change something in your program.

And modifying a "select case" does not seem to be the biggest part of the
modification... In addition, all the alternatives mentioned by other people
seem far more complicated than that !

Stefano Zaghi

Oct 8, 2016, 4:42:09 AM
Dear Pascal,

As others have already noted, your "select case" proposal seems to be the easiest (or at least the clearest and most concise) solution. You have not provided many details, so my understanding could be dramatically wrong, but I have assumed:

+ each keyword needs its own procedure;
+ unclear if in the file processed you have the "specific-association" 'key1 => procedure_foo', or simply a list of keywords to be processed;

The two cases originating from the second point are quite different, but your code has to implement all procedures anyway.

Let us consider two scenarios.

SIMPLE LIST OF KEYWORDS
Your input file is a list of keys, e.g.

---input file
# list of keywords to be processed
key2 args_for_proc
key567 args_for_proc
...
---end input file


In this case your select case approach looks effective to me. All other pure Fortran approaches (that is, excluding non-Fortran tricks) are conceptually very similar to the select case method: hashtable, "jump table", factory pattern, strategy pattern... can all be considered different "flavors" of the same logic. Your main drawback, i.e. the need to update the select case for each new keyword, is not eliminated by the other methods; it is only shifted to another kind of update (add a new node to the hash table, provide a new strategy...).
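
Whichever flavor is chosen, each line of such an input file first has to be split into a keyword and its arguments. A small sketch, assuming that `#` starts a comment line and that the first blank separates the keyword from the rest (the names `input_line_m` and `split_line` are made up for the illustration):

```fortran
module input_line_m
   implicit none
   private
   public :: split_line

contains

   ! Split "key args..." into keyword and argument.
   ! has_keyword is false for blank lines and for lines starting with "#".
   subroutine split_line(line, keyword, argument, has_keyword)
      character(len=*), intent(in) :: line
      character(len=:), allocatable, intent(out) :: keyword, argument
      logical, intent(out) :: has_keyword
      character(len=:), allocatable :: text
      integer :: blank

      text = trim(adjustl(line))
      has_keyword = len(text) > 0
      if (has_keyword) has_keyword = text(1:1) /= "#"
      if (.not. has_keyword) then
         keyword  = ""
         argument = ""
         return
      end if

      blank = index(text, " ")          ! first blank after the keyword, if any
      if (blank == 0) then
         keyword  = text
         argument = ""
      else
         keyword  = text(:blank - 1)
         argument = trim(adjustl(text(blank + 1:)))
      end if
   end subroutine split_line

end module input_line_m
```

The resulting `keyword` can then be fed to the select case (or to any of the other mechanisms); tab characters are not treated as separators in this sketch.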

THE INPUT FILE "DEFINES" EACH SPECIFIC KEY-PROCEDURE ASSOCIATION
Your input file is a list of keys WITH the associated procedure to be used, e.g.

---input file
# list of keywords/procedures to be processed
key2 process_foo args_for_proc
key567 process_bar args_for_proc
...
---end input file

In this more complex case I still think that your select case approach is one of the best for ease and clarity. However, here a more complex approach such as the factory pattern could simplify the maintenance and improvement of the code.


Nevertheless, if you want to restrict yourself to Fortran, all solutions look like a different flavor of the select case logic. As a matter of fact, to my knowledge, Fortran does not have the "introspection" capability of other languages (e.g. Python), so an "introspective expansion" like "call key_proc(proc='process_foo', key='key1', args=...)", where the string "process_foo" is "expanded" into a call to your actual "process_foo" procedure, is not possible (unless you make process_foo a stand-alone program that can be run through the F2008 execute_command_line intrinsic).

I agree with Campbell: your select case seems very effective. Anyhow, give us a more detailed view of your aim.

My best regards.


Pascal

Oct 8, 2016, 6:35:48 AM
On 07/10/16 15:46, Pascal wrote:
> Hi,
>
> I am processing a file which has about a hundred keywords. Each keyword
> would be processed by a particular subroutine.
>
> [...]

Thank you for all the comments, looks like a select case is the
easiest/simplest approach.
Each keyword is associated to a unique subroutine.

I thought that, since you can pass a subroutine as an argument, there would be a
way to dynamically `convert` a string to a subroutine.

Pascal

JerryD

Oct 8, 2016, 11:38:48 AM
One other option not mentioned is using a preprocessor on your source code with
an included definition file that contains the keywords and the corresponding
procedure names and arguments. Macro expansion then builds the code for you.

For myself I would probably try a derived type/class that contains the keyword
and procedure pointer and build an array of these initialized in an included file.

Jerry


herrman...@gmail.com

Oct 8, 2016, 12:52:13 PM
On Saturday, October 8, 2016 at 3:35:48 AM UTC-7, Pascal wrote:

(snip)

> Thank you for all the comments, looks like a select case is the
> easiest/simplest approach.
> Each keyword is associated to a unique subroutine.

> I thought that, since you can pass a subroutine as an argument, there would be a
> way to dynamically `convert` a string to a subroutine.

Some interpreted languages allow calling subroutines using
a name in a string variable.

For compiled languages, when you pass a subroutine name as
an argument, it is usually converted to the address of the subroutine
at compile time. In some cases, the actual name is available for
printing out in messages, but not for use in calls.

There are sometimes tricks you can play with dynamic
linking to do something similar, but that is usually outside
the language definition.

-- glen

Louis Krupp

Oct 8, 2016, 1:33:31 PM
On Sat, 8 Oct 2016 11:35:46 +0100, Pascal <pasc...@parois.net>
wrote:
There is a way; see:

http://pubs.opengroup.org/onlinepubs/009695399/functions/dlsym.html

and

http://stackoverflow.com/questions/38710099/fortran-dynamic-libraries-load-at-runtime

When you add a new keyword, you could write a new subroutine, compile
it, and add it to the shared object or dynamic link library or
whatever it is on your system. When you read a keyword from your
input file, you could use it to generate the name of a symbol in your
shared object (if you're on Linux, etc) and then you could call
dlsym() to get the address of the subroutine. If dlsym() returns NULL
for a keyword, then the keyword's subroutine isn't in the shared
object and you can tell the user that the keyword hasn't been
implemented.
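
A rough sketch of the lookup step described above, for a POSIX system. The assumptions are loud ones: the keyword routines live in a shared object and are exported with C binding under the made-up names `kw_<keyword>` (for example `subroutine kw_key2(argument) bind(C, name="kw_key2")`), the value of `RTLD_LAZY` is the one used by Linux/glibc, and older systems may need to link with `-ldl`. Only `dlopen` and `dlsym` are real library calls; everything else is named for the illustration.

```fortran
module dynload_m
   use, intrinsic :: iso_c_binding
   implicit none
   private
   public :: keyword_handler, find_handler

   integer(c_int), parameter :: RTLD_LAZY = 1   ! Linux/glibc value; an assumption elsewhere

   ! Interface every keyword routine in the shared object is assumed to have.
   abstract interface
      subroutine keyword_handler(argument) bind(C)
         import :: c_char
         character(kind=c_char), intent(in) :: argument(*)
      end subroutine keyword_handler
   end interface

   interface
      function dlopen(filename, flag) bind(C, name="dlopen") result(handle)
         import :: c_ptr, c_char, c_int
         character(kind=c_char), intent(in) :: filename(*)
         integer(c_int), value :: flag
         type(c_ptr) :: handle
      end function dlopen

      function dlsym(handle, symbol) bind(C, name="dlsym") result(address)
         import :: c_ptr, c_funptr, c_char
         type(c_ptr), value :: handle
         character(kind=c_char), intent(in) :: symbol(*)
         type(c_funptr) :: address
      end function dlsym
   end interface

contains

   ! Return a pointer to the routine for a keyword, or a null pointer when
   ! the library or the symbol cannot be found.
   subroutine find_handler(library, keyword, handler)
      character(len=*), intent(in) :: library, keyword
      procedure(keyword_handler), pointer :: handler
      type(c_ptr)    :: handle
      type(c_funptr) :: address

      handler => null()
      handle = dlopen(trim(library) // c_null_char, RTLD_LAZY)
      if (.not. c_associated(handle)) return

      ! The library is deliberately left open so the pointer stays valid.
      address = dlsym(handle, "kw_" // trim(keyword) // c_null_char)
      if (c_associated(address)) call c_f_procpointer(address, handler)
   end subroutine find_handler

end module dynload_m
```

After `call find_handler(library, keyword, handler)`, a caller would test `associated(handler)` and either call it with a null-terminated argument, e.g. `call handler(trim(argument) // c_null_char)`, or report that the keyword has not been implemented.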

The question is whether or not this sounds like fun. If it does, then
go for it.

Louis

David Jones

Oct 9, 2016, 1:55:20 PM
A more direct approach is to replace the idea of using a "subroutine" with
that of using an individual "program" corresponding to each keyword. That
is, use a system level call to run a program whose name is either exactly
the keyword, or derived from it in a standard way, and then pass
information back in some fixed way using a file. No need to have a
predefined list of possible keywords or to rewrite the main program for
new ones.

For usability, one might either check for the existence of the program
executable or obtain a list of all the executables in the intended program
directory.

The usefulness of this idea clearly depends on your overall context, and
on how much overhead there is in making a system call of this type.

--
Using Opera's mail client: http://www.opera.com/mail/

herrman...@gmail.com

Oct 9, 2016, 3:45:11 PM
On Sunday, October 9, 2016 at 10:55:20 AM UTC-7, David Jones wrote:
> On Sat, 08 Oct 2016 11:35:46 +0100, Pascal <pascal..p@parois.net> wrote:

(snip)

> >> I am processing a file which has about a hundred keywords. Each keyword
> >> would be processed by a particular subroutine.

(snip)

> A more direct approach is to replace the idea of using a "subroutine" with
> that of using an individual "program" corresponding to each keyword. That
> is, use a system level call to run a program whose name is either exactly
> the keyword, or derived from it in a standard way, and then pass
> information back in some fixed way using a file. No need to have a
> predefined list of possible keywords or to rewrite the main program for
> new ones.

It depends much on details not given.

Another way, especially if the different subroutines aren't so different,
is to write one subroutine leaving important parameters as input data,
supplied by an input file. Often enough, this results in a specialized
language for writing the different routines (optimized for the problem
at hand), a compiler for that language generating some form of
intermediate code, and then the program interprets that code,
selecting the routine, read from a file, based on the input string.

I did one once where the different "routines" were actually
pattern matching problems, and it turned out that they could
be implemented as regular expressions. An input file gave
the appropriate regular expression for each case, and
instructions told users how to add new ones later, if needed.

Terence

Oct 13, 2016, 12:33:34 AM
"FortranFan" wrote (about Terence's preferred method)
>Will it be possible for you to show some actual code with a fully worked
>out example along with your analysis of the benefits of your technique
>relative to other options within the language, such as with (a possibly
>long) SELECT CASE construct mentioned in the original post?

I thought it was obvious: the 'simple' programming to locate the routine
that follows acquiring the keyword index number beyond a small range, has as
the fastest way, a binary splitting of the range of the index number into
tests in powers of 2, for sub-range section before actually performing any
CASE determination, which calls the actual subroutine.

I used this in all my software where I needed to offer pretty much any
external single-character based language (latino, Greek or Romaji come to
mind); but the same is applicable to modern multiple-byte symbol
identifications.

In the 'good old days', most Fortran compilers constructed a list of actual
routine addresses, with a null for unsupported indices, so the routine
associated with the index number was found in an in-line list of actual
routine addresses in index order. Perhaps before your time?

Mind you, I still prefer a computed GOTO over CASE anyway, since this is
usually compiled as a faster selection.

campbel...@gmail.com

Oct 13, 2016, 1:07:41 AM
On Thursday, October 13, 2016 at 3:33:34 PM UTC+11, Terence wrote:
>
> Mind you, I still prefer a computed GOTO over CASE anyway, since this is
> usually compiled as a faster selection.

How true is this claim? And is the speed of this and other search techniques really necessary?

The original premise was that a subroutine is available for each keyword, so is a speedy selection of the subroutine the problem?
And why dynamic definition of keywords, when a specific subroutine is required to process each identified keyword?

I think some of the ideas expressed in this thread are a long way from assisting the original post.

Stefano Zaghi

Oct 13, 2016, 3:38:12 AM
Dear Terence,

On Thursday, 13 October 2016 06:33:34 UTC+2, Terence wrote:
> I thought it was obvious: the 'simple' programming to locate the routine
> that follows acquiring the keyword index number beyond a small range, has as
> the fastest way, a binary splitting of the range of the index number into
> tests in powers of 2, for sub-range section before actually performing any
> CASE determination, which calls the actual subroutine.

Indeed, I do not understand you. A concrete example, even in pseudo code, applied to the scenario of OP is really welcome to better understand your technique. In particular, I cannot understand in "what" and how it is different from the very concise and clear select case approach.

I agree with campbel that for the OP needs the select case approach looks well suited.

> Mind you, I still prefer a computed GOTO over CASE anyway, since this is
> usually compiled as a faster selection.

Well, I always disagree with such "sharp" statements...

Can you provide a concrete benchmark proving that goto is faster than select case (in the OP scenario)?

Moreover, even in the case some benchmarks show that, how big is the speedup?

In my opinion, what really matters when talking about "goto" is a different kind of issue from speed: gotos produce unreadable, unmaintainable spaghetti code that strongly limits important qualities like conciseness and clarity, and that also hurts the speed that is so important to you (by preventing optimization, vectorization, multi-threading...).

Nope, nowadays the suggestion to use goto is anachronistic and I think that goto should be considered "bad practice".

My best regards.


FJ

Oct 13, 2016, 12:28:02 PM
On 13/10/2016 at 09:37, Stefano Zaghi wrote:
> Dear Terence,
>
> On Thursday, 13 October 2016 06:33:34 UTC+2, Terence wrote:
>> I thought it was obvious: the 'simple' programming to locate the routine
>> that follows acquiring the keyword index number beyond a small range, has as
>> the fastest way, a binary splitting of the range of the index number into
>> tests in powers of 2, for sub-range section before actually performing any
>> CASE determination, which calls the actual subroutine.
>
> Indeed, I do not understand you. A concrete example, even in pseudo code,
> applied to the scenario of OP is really welcome to better understand your
> technique. In particular, I cannot understand in "what" and how it is
> different from the very concise and clear select case approach.
>
> I agree with campbel that for the OP needs the select case approach looks
> well suited.
>
>> Mind you, I still prefer a computed GOTO over CASE anyway, since this is
>> usually compiled as a faster selection.
>
> Well, I disagree ever on such "sharp" sentences...
>
> Can you provide a concrete benchmark proving that goto is faster than select
> case (in the OP scenario)?

I did such a benchmark a few years ago ... and Terence is right!

>
> Moreover, even in the case some benchmarks show that, how big is the
> speedup?

Most often, the speedup does not matter. And a "select case" construct is
easier to extend.

>
> In my opinion, what really matter when talking about "goto" is all another
> kind of issues with respect the speed: gotos produce unreadable, non
> maintainable, spaghetti-code that strongly prevent and limit important
> features like conciseness and clearness and strongly impact also on speed that
> is so important for you (preventing optimization, vectorization,
> multi-threading...).

For normal GOTO, you are right, but computed goto is a quite clear construct
which looks like a select case despite the high number of labels.
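
To make the comparison concrete, here is the same two-way selection on an integer ID written both ways. It is a sketch with invented names, intended only to show the shape of the two constructs, and it says nothing about their relative speed. The Fortran standard lists the computed GO TO as obsolescent, so some compilers warn about it.

```fortran
module selection_m
   implicit none
   private
   public :: by_select_case, by_computed_goto

contains

   function by_select_case(id) result(name)
      integer, intent(in) :: id
      character(len=8) :: name

      select case (id)
      case (1)
         name = "first"
      case (2)
         name = "second"
      case default
         name = "unknown"
      end select
   end function by_select_case

   function by_computed_goto(id) result(name)
      integer, intent(in) :: id
      character(len=8) :: name

      go to (10, 20), id
      name = "unknown"      ! execution falls through here when id is not 1 or 2
      return
10    name = "first"
      return
20    name = "second"
      return
   end function by_computed_goto

end module selection_m
```

With about a hundred keywords the label list in the `go to` grows to about a hundred entries, each of which must match a label placed somewhere in the routine, whereas each `case` keeps its value next to its code.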

Richard Maine

Oct 13, 2016, 2:42:53 PM
FJ <francois.jacq@invalid> wrote:

> Most often, the speedup does not matter. And a "select case" construct is
> easier to extend.

Indeed the OP mentioned nothing about speed, and considering that it is
used to select a procedure to call, it seems unlikely that the speed of
the selection would be significant compared to the overhead of the
subroutine calls unless they were the most trivial of inlineable
subroutines, which doesn't seem likely from the problem description.
Seems to me that speed is likely to be very low on the list of concerns
here, far behind things like clarity, extensibility, etc., which were
the things the OP asked about.

> For normal GOTO, you are right, but computed goto is a quite clear construct
> which looks like a select case despite the high number of labels.

Don't forget about computed goto being formally obsolescent in the
standard. That might not matter to some people. To others, it can be a
deal-breaker due to policy dictates.

FortranFan

Oct 13, 2016, 3:46:33 PM
On Thursday, October 13, 2016 at 12:28:02 PM UTC-4, FJ wrote:

> On 13/10/2016 at 09:37, Stefano Zaghi wrote:
> ..
> >
> > Can you provide a concrete benchmark proving that goto is faster than select
> > case (in the OP scenario)?
>
> I did such a benchmark a few years ago ... and Terence is right!
>
> ..

@FJ,

It'll be nice if you can dig it up or provide some reference for your study because note Terence's point was, "I still prefer a computed GOTO over CASE anyway, since this is usually compiled as a faster selection".

I ask because I have a tough time understanding why there would be much of any difference in the compilers today between the two constructs, especially considering comments I recall seeing on the big commercial compiler forum that suggested once compiler optimization is brought to bear, it would make little to no performance difference. The recommendation was to write code that expressed the algorithm and programmer intent and things like that best.

OP says there are about "a hundred keywords" with CASE construct for each and which can grow further over time; imagine the computed GOTO for such a scenario and the chance of coding errors with the labels!?

FortranFan

Oct 13, 2016, 4:56:01 PM
On Thursday, October 13, 2016 at 12:33:34 AM UTC-4, Terence wrote:

> ..
>
> I thought it was obvious: the 'simple' programming to locate the routine
> that follows acquiring the keyword index number beyond a small range, has as
> the fastest way, a binary splitting of the range of the index number into
> tests in powers of 2, for sub-range section before actually performing any
> CASE determination, which calls the actual subroutine.
>

@Terence,

Well, please note your explanation is, unfortunately, neither "simple" nor "obvious" to me in the least bit and it has a lot to do with my inadequate background, especially with programming toward highly resource-constrained target computing platforms and environments. But looking at a couple of other comments by Stefano Zaghi and John Campbell, in the particular case of this thread though, it may be not relevant. That is, they have the same concerns and doubts, especially with respect to OP's question, "I am wondering if there is an easier way". So our request to you is earnest: can you show with actual code how your approach helps in this regard?

> ..
>
> In the 'good old days', most Fortran compilers constructed .. an in-line list of actual routine addresses in index order. Perhaps before your time?
>

Yes, indeed. This is also another reason for my request to you to show some actual code. You definitely have succeeded extensively with your approach; others can learn from your vast algorithmic and other coding inventions and experience. If you can document all that in some fashion, perhaps in the form of online blogs or papers or even a book, it will be indeed valuable for the following generations.

> Mind you, I still prefer a computed GOTO over CASE anyway, since this is
> usually compiled as a faster selection.

As I commented in response to the post by FJ, what you state here is contrary to the current 'conventional wisdom' that once optimization kicks in, there should be little to no performance difference between the two constructs. So a developer can then stop worrying about such 'premature optimization' - the kind decried by Knuth - and focus on writing code that 'best' expresses the algorithm, the coding purpose, and so forth. Note the Fortran standard states, "The computed GO TO has been superseded by the SELECT CASE construct, which is a generalized, easier to use, and clearer means of expressing the same computation."

Please keep in mind I'm not asking you to report anything fancy, just code snippets around the need expressed in the original post. If that's too vague, you can even take the simple code below (which should express OP's current approach adequately), modify it as you see fit and share your changes here.

-- begin code --
module procs_m

   implicit none

   private

   public :: s1
   public :: s2

contains

   subroutine s1( msg )

      character(len=*), intent(in) :: msg

      print *, "s1 gets the message: ", msg

      return

   end subroutine s1

   subroutine s2( msg )

      character(len=*), intent(in) :: msg

      print *, "s2 gets the message: ", msg

      return

   end subroutine s2

end module procs_m

module m

   use procs_m, only : s1, s2

   implicit none

   private

   character(len=*), parameter, public :: keywords(*) = [ character(len=2) :: "s1", "s2" ]

   public :: s

contains

   subroutine s( keyword, msg )

      character(len=*), intent(in) :: keyword
      character(len=*), intent(in) :: msg

      select case ( keyword )

         case ( keywords(1) )

            call s1( msg )

         case ( keywords(2) )

            call s2( msg )

         case default

            print *, "s: invalid keyword of ", keyword
            return

      end select

      return

   end subroutine s

end module m

program p

   use m, only : s, keywords

   implicit none

   character(len=:), allocatable :: keyword
   character(len=*), parameter :: Message = "Keep it simple!"

   keyword = keywords(2)
   call s( keyword, Message )

   keyword = keywords(1)
   call s( keyword, Message )

   stop

end program p
-- end code --

Upon execution, the output is (obviously):
s2 gets the message: Keep it simple!
s1 gets the message: Keep it simple!



dpb

Oct 13, 2016, 6:56:31 PM
On 10/13/2016 2:46 PM, FortranFan wrote:
> On Thursday, October 13, 2016 at 12:28:02 PM UTC-4, FJ wrote:
>> On 13/10/2016 at 09:37, Stefano Zaghi wrote:
>> ..
>>>
>>> Can you provide a concrete benchmark proving that goto is faster than select
>>> case (in the OP scenario)?
>>
>> I did such a benchmark a few years ago ... and Terence is right!
>>
>> ..
>
> @FJ,
...

> I ask because I have a tough time understanding why there would be much
> of any difference in the compilers today between the two constructs,
> especially considering comments I recall seeing on the big commercial
> compiler forum that suggested once compiler optimization is brought to
> bear, it would make little to no performance difference. The
> recommendation was to write code that expressed the algorithm and
> programmer intent and things like that best.
>
> OP says there are about "a hundred keywords" with CASE construct for
> each and which can grow further over time; imagine the computed GOTO
> for such a scenario and the chance of coding errors with the
> labels!?

I'm not a compiler writer of any sort and certainly not when it gets to
optimizing the results of a basic translation but I wonder if any
comparison that showed a difference would be related to whether the
lookup selection for the CASE turned out to be a serial series of
comparisons as opposed to a jump table to a precomputed location in the
GOTO? I'm quite possibly all wet here and there's nothing whatever like
that happening, too... :)

herrman...@gmail.com

Oct 14, 2016, 1:01:31 AM
On Thursday, October 13, 2016 at 3:56:31 PM UTC-7, dpb wrote:
> On 10/13/2016 2:46 PM, FortranFan wrote:

(snip)

> > I ask because I have a tough time understanding why there would be much
> > of any difference in the compilers today between the two constructs,
> > especially considering comments I recall seeing on the big commercial
> > compiler forum that suggested once compiler optimization is brought to
> > bear, it would make little to no performance difference. The
> > recommendation was to write code that expressed the algorithm and
> > programmer intent and things like that best.

(snip)

> I'm not a compiler writer of any sort and certainly not when it gets to
> optimizing the results of a basic translation but I wonder if any
> comparison that showed a difference would be related to whether the
> lookup selection for the CASE turned out to be a serial series of
> comparisons as opposed to a jump table to a precomputed location in the
> GOTO? I'm quite possibly all wet here and there's nothing whatever like
> that happening, too... :)

I am not at all sure what compilers now generate for select/case.

One hopes that in the case of sequential CASE values that a jump table
is used instead of a sequence of conditional tests, but you never know
until you try and look at the generated code.

Compilers are getting better all the time, but I don't know
about this one.

Stefano Zaghi

Oct 14, 2016, 3:39:09 AM
Dear FJ,

On Thursday, 13 October 2016 18:28:02 UTC+2, FJ wrote:
> I did such a benchmark a few years ago ... and Terence is right!

I do not want you to think there is anything personal here, but until you provide a concrete benchmark, with the possibility to see your code, your statement, like Terence's, has no value: provide a benchmark and let us repeat your test with the compilers available, and then we can discuss the possible speedup.

> Most often, the speedup does not matter. And a "select case" construct is
> easier to extend.
> For normal GOTO, your are right but computed goto is a quit clear construct
> which looks like a select case despite the high number of labels.

Well, a long list of labels (on the order of 100 in the OP's case) is really unreadable. What is worse, the labels can be placed anywhere: it is up to the programmer to be "clean" and put them in a clear sequence, and you cannot rely on that. Goto (computed or not) lets the algorithm jump anywhere, so it results in unmaintainable and unreadable code that can even prevent further optimizations such as vectorization and multi-threading. There are good reasons why computed goto is tagged as "obsolescent".

Anyhow, as Richard noted, the OP asked for an easier/cleaner approach to maintaining and extending a long list of procedures called through some keyword-based jumping mechanism; (s)he did not ask how to speed up the calls. Consequently, I do not understand Terence's suggestion. Aside from the fact that I still do not understand the "simple programming" part of his approach (an example would be very helpful for me to learn something new), the part related to goto is really counterproductive with respect to the OP's request: goto is not cleaner than select case, it is not easier to extend than select case, and it is not easier to optimize than select case. Its only (assumed and not proven) advantage is a "possibly" faster jump to the correct procedure call, and how much faster is not even claimed. As I already said, promoting goto nowadays is anachronistic, and I hope I have explained my reasons.
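
To make the "easier to extend" point concrete, here is a minimal sketch of the procedure-pointer table suggested earlier in the thread. Everything in it is an assumption made for illustration and not code from the OP: the module names `dispatch_m` and `handlers_m`, the routines `register_keyword` and `process`, the 32-character keyword limit and the fixed table size. A real implementation might well replace the linear search with a hash table.

```fortran
module dispatch_m
   implicit none
   private
   public :: handler_i, register_keyword, process

   ! Every keyword handler is assumed to have this (hypothetical) interface.
   abstract interface
      subroutine handler_i(argument)
         character(len=*), intent(in) :: argument
      end subroutine handler_i
   end interface

   type :: entry_t
      character(len=32) :: keyword = ''
      procedure(handler_i), pointer, nopass :: handler => null()
   end type entry_t

   integer, parameter :: MAX_ENTRIES = 128
   type(entry_t) :: table(MAX_ENTRIES)
   integer :: n_entries = 0

contains

   ! Add one keyword/handler pair to the table.
   subroutine register_keyword(keyword, handler)
      character(len=*), intent(in) :: keyword
      procedure(handler_i) :: handler
      if (n_entries >= MAX_ENTRIES) error stop 'dispatch table is full'
      n_entries = n_entries + 1
      table(n_entries)%keyword = keyword
      table(n_entries)%handler => handler
   end subroutine register_keyword

   ! Look the keyword up (linear search) and call its handler if present.
   subroutine process(keyword, argument, found)
      character(len=*), intent(in) :: keyword, argument
      logical, intent(out) :: found
      integer :: i
      found = .false.
      do i = 1, n_entries
         if (table(i)%keyword == keyword) then
            call table(i)%handler(argument)
            found = .true.
            return
         end if
      end do
   end subroutine process

end module dispatch_m

module handlers_m
   implicit none
   integer :: calls(2) = 0   ! call counters, only used by the self-check below
contains
   subroutine keyword1(argument)
      character(len=*), intent(in) :: argument
      calls(1) = calls(1) + 1
      print '(2a)', 'keyword1 got: ', argument
   end subroutine keyword1

   subroutine keyword2(argument)
      character(len=*), intent(in) :: argument
      calls(2) = calls(2) + 1
      print '(2a)', 'keyword2 got: ', argument
   end subroutine keyword2
end module handlers_m

program dispatch_demo
   use dispatch_m, only: register_keyword, process
   use handlers_m, only: keyword1, keyword2, calls
   implicit none
   logical :: found

   call register_keyword('keyword1', keyword1)
   call register_keyword('keyword2', keyword2)

   call process('keyword2', 'some argument', found)
   if (.not. found .or. calls(2) /= 1) error stop 'keyword2 was not dispatched'

   call process('unknown', 'x', found)
   if (found) error stop 'an unknown keyword should not match'
end program dispatch_demo
```

With this layout, adding a keyword means writing one handler and one `register_keyword` call; `process` itself does not change.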

My best regards.

dpb

unread,
Oct 14, 2016, 9:54:20 AM10/14/16
to
On 10/14/2016 12:01 AM, herrman...@gmail.com wrote:
...

> I am not at all sure what compilers now generate for select/case.
>
> One hopes that in the case of sequential CASE values that a jump table
> is used instead of a sequence of conditional tests, but you never know
> until you try and look at the generated code.
>
> Compilers are getting better all the time, but I don't know
> about this one.

Well, I'd surely think that by now a compiler with optimization enabled
would do so as well (generate the jump table, that is); my conjecture
was based solely on inference from the (more-or-less dated?) previous
comparison, which was, then at least, slower for some particular test
case(s).

I was hypothesizing the (more-or-less obvious, I think?) reason for that
being so then; it is possibly, or even likely, no longer the case for
more recent compilers, presuming nothing that prevents optimization is
present in the code, anyway.

Gary Scott

unread,
Oct 15, 2016, 12:05:13 PM10/15/16
to
I just did a small test. I TRIED to model a select case with 5 case
selections and I TRIED to design a goto construct that did the same
thing. Probably not exactly. The computed goto was clearly faster in
this test. On average, the select case took 40ns, the computed goto
took 31ns. How would you like to improve this? IVF option /O3
(with /Od, results are 48ns and 36ns)

knt = 1
do i = 1,size(ttag)-1,2
   call svTTReadTicks(tTag(i))

   select case(knt)
   case(1)
      knt = 2
   case(2)
      knt = 3
   case(3)
      knt = 1
   case(4)
      knt = 5
   case(5)
      knt = 4
   end select

   call svTTReadTicks(tTag(i+1))
end do

do i = 1,size(ttag),2
   call svTTTicks2Value(ttag(i),ttValue1)
   call svTTTicks2Value(ttag(i+1),ttValue2)
   write(10,'(f19.12)')ttValue2-ttValue1
end do


knt = 1
do i = 1,size(ttag)-1,2
   call svTTReadTicks(tTag(i))

   goto (1,2,3,4,5), knt

1  knt = 2
   go to 10
2  knt = 3
   go to 10
3  knt = 1
   go to 10
4  knt = 5
   go to 10
5  knt = 4

10 call svTTReadTicks(tTag(i+1))
end do
write(10,*)' '
write(10,*)'goto'
do i = 1,size(ttag),2
   call svTTTicks2Value(ttag(i),ttValue1)
   call svTTTicks2Value(ttag(i+1),ttValue2)
   write(10,'(f19.12)')ttValue2-ttValue1
end do

Gary Scott

unread,
Oct 15, 2016, 1:16:02 PM10/15/16
to
Interestingly, this IF construct was nearly identical on average to the
select case above. These 3 tests are performed serially in the same
run. I ran it 10 times (10k loop samples each) with the same relative
results, +/- 3ns. In one run, the IF construct was 4ns faster than
select case, certainly a fluke of machine state.


write(10,*)'IF'
knt = 1
do i = 1,size(ttag)-1,2
   call svTTReadTicks(tTag(i))

   if (knt == 1) then
      knt = 2
   else if (knt == 2) then
      knt = 3
   else if (knt == 3) then
      knt = 1
   else if (knt == 4) then
      knt = 5
   else if (knt == 5) then
      knt = 4
   end if

herrman...@gmail.com

unread,
Oct 15, 2016, 1:59:17 PM10/15/16
to
On Saturday, October 15, 2016 at 9:05:13 AM UTC-7, Gary Scott wrote:
> On 10/14/2016 8:54 AM, dpb wrote:
> > On 10/14/2016 12:01 AM, herrmanns...@gmail.com wrote:

> >> I am not at all sure what compilers now generate for select/case.

> >> One hopes that in the case of sequential CASE values that a jump table
> >> is used instead of a sequence of conditional tests, but you never know
> >> until you try and look at the generated code.

(snip)

> I just did a small test. I TRIED to model a select case with 5 case
> selections and I TRIED to design a goto construct that did the same
> thing. Probably not exactly. The computed goto was clearly faster in
> this test. On average, the select case took 40ns, the computed goto
> took 31ns. How would you like to improve this? IVF option /O3
> (with /Od, results are 48ns and 36ns)

Just to be sure, could you try one exchanging the select/case and goto
parts, such that the goto was first, and select/case second?

That is, to be sure that it isn't any start up or cache effects?

You could also put both inside a single DO loop, instead of two loops.

There are so many tricks for compilers and processors, it is hard to
know what one might do.

You could also post the generated assembly code for comparison.

thanks,

-- glen
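
The checks suggested above (swap the order, rule out start-up and cache effects) can be sketched as a small harness. This is only an illustration under stated assumptions: `run_select`, `run_if`, `timed`, `NREP` and `NITER` are hypothetical names, the branch bodies are trivial, and an optimizer may simplify such loops considerably, so the numbers it prints should not be read as a verdict on either construct.

```fortran
program warmup_demo
   use, intrinsic :: iso_fortran_env, only: int64, real64
   implicit none
   integer, parameter :: NREP = 6, NITER = 1000000
   real(real64) :: t_select(NREP), t_if(NREP)
   integer :: rep, sink

   ! Untimed warm-up pass: absorbs start-up and cold-cache effects.
   sink = run_select(NITER) + run_if(NITER)

   ! Alternate which construct is timed first in each repetition.
   do rep = 1, NREP
      if (mod(rep, 2) == 1) then
         t_select(rep) = timed(1)
         t_if(rep)     = timed(2)
      else
         t_if(rep)     = timed(2)
         t_select(rep) = timed(1)
      end if
   end do

   print '(a, es10.3, a)', 'select case, best run: ', minval(t_select), ' s'
   print '(a, es10.3, a)', 'if / else if, best run: ', minval(t_if), ' s'
   print '(a, i0)', 'sink (printed so the work is not discarded): ', sink

contains

   ! Time one construct with a 64-bit SYSTEM_CLOCK counter.
   function timed(which) result(seconds)
      integer, intent(in) :: which
      real(real64) :: seconds
      integer(int64) :: t0, t1, rate
      call system_clock(t0, rate)
      if (which == 1) then
         sink = sink + run_select(NITER)
      else
         sink = sink + run_if(NITER)
      end if
      call system_clock(t1)
      seconds = real(t1 - t0, real64) / real(rate, real64)
   end function timed

   ! Cycle knt through 1 > 5 > 4 > 3 > 2 > 1 with SELECT CASE.
   function run_select(n) result(knt)
      integer, intent(in) :: n
      integer :: knt, i
      knt = 1
      do i = 1, n
         select case (knt)
         case (1)
            knt = 5
         case (2)
            knt = 1
         case (3)
            knt = 2
         case (4)
            knt = 3
         case (5)
            knt = 4
         end select
      end do
   end function run_select

   ! The same cycle written as an IF / ELSE IF chain.
   function run_if(n) result(knt)
      integer, intent(in) :: n
      integer :: knt, i
      knt = 1
      do i = 1, n
         if (knt == 1) then
            knt = 5
         else if (knt == 2) then
            knt = 1
         else if (knt == 3) then
            knt = 2
         else if (knt == 4) then
            knt = 3
         else if (knt == 5) then
            knt = 4
         end if
      end do
   end function run_if

end program warmup_demo
```

Reporting the best (minimum) time per construct is one common way to reduce the influence of outlier samples; comparing the generated assembly remains the more direct check.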

Gary Scott

unread,
Oct 15, 2016, 3:11:13 PM10/15/16
to
Changing the order does seem to equalize the performance except that the
first series regardless of type seems to have more outlier samples
(cache misses)? Reducing the number of samples eliminates the
measurable difference (with this method). I need to modify the sequence
as it is only executing the first few statements though.

I guess it's hard to say there's a difference with this test method.

Gary Scott

unread,
Oct 15, 2016, 3:32:37 PM10/15/16
to
Modified to select 1 > 5 > 4 > 3 > 2 > 1... no difference. With
optimization on or off, the first test always takes slightly longer than
the second and third, whether it is an SC, an IF or a goto test, but this
appears to be primarily due to outlier samples. Most samples are
indistinguishable.

Gary Scott

unread,
Oct 15, 2016, 3:58:13 PM10/15/16
to
Ah, "solution", I added a "dummy" test repeating the SC at the top.
After doing this, the average timing for each subsequent type, select
case, computed goto, and if statement are now identical (both /Od
(+/-3ns) and /O3 (+/-1ns), 1000 samples)...hmmm, I had actually planned
to do some real work today...

Suspicious...

write(10,*)' '
write(10,*)'DummySC'
knt = 1
do i = 1,size(ttag)-1,2
   call svTTReadTicks(tTag(i))

   select case(knt)
   case(1)
      knt = 5
   case(2)
      knt = 4
   case(3)
      knt = 3
   case(4)
      knt = 2
   case(5)
      knt = 1
   end select

   call svTTReadTicks(tTag(i+1))
end do

do i = 1,size(ttag)-1,2
   call svTTTicks2Value(ttag(i),ttValue1)
   call svTTTicks2Value(ttag(i+1),ttValue2)
   write(10,'(f19.12)')ttValue2-ttValue1
end do

write(10,*)' '
write(10,*)'SC'
knt = 1
do i = 1,size(ttag)-1,2
   call svTTReadTicks(tTag(i))

   select case(knt)
   case(1)
      knt = 5
   case(2)
      knt = 4
   case(3)
      knt = 3
   case(4)
      knt = 2
   case(5)
      knt = 1
   end select

   call svTTReadTicks(tTag(i+1))
end do

do i = 1,size(ttag)-1,2
   call svTTTicks2Value(ttag(i),ttValue1)
   call svTTTicks2Value(ttag(i+1),ttValue2)
   write(10,'(f19.12)')ttValue2-ttValue1
end do

write(10,*)' '
write(10,*)'goto'
knt = 1
do i = 1,size(ttag)-1,2
   call svTTReadTicks(tTag(i))

   goto (1,2,3,4,5), knt

1  knt = 5
   go to 10
2  knt = 4
   go to 10
3  knt = 3
   go to 10
4  knt = 2
   go to 10
5  knt = 1

10 call svTTReadTicks(tTag(i+1))
end do


do i = 1,size(ttag)-1,2
   call svTTTicks2Value(ttag(i),ttValue1)
   call svTTTicks2Value(ttag(i+1),ttValue2)
   write(10,'(f19.12)')ttValue2-ttValue1
end do

write(10,*)' '
write(10,*)'IF'
knt = 1
do i = 1,size(ttag)-1,2
   call svTTReadTicks(tTag(i))

   if (knt == 1) then
      knt = 5
   else if (knt == 2) then
      knt = 4
   else if (knt == 3) then
      knt = 3
   else if (knt == 4) then
      knt = 2
   else if (knt == 5) then
      knt = 1
   end if

   call svTTReadTicks(tTag(i+1))
end do

do i = 1,size(ttag)-1,2

Stefano Zaghi

unread,
Oct 15, 2016, 4:03:58 PM10/15/16
to
Dear Gary,

thank you for your test. However, I think that your test has some weaknesses:

1. you did not provide a full program, thus we cannot reproduce it;
2. there are no details about the architecture you used;
3. your test seems far from the OP's scenario: the work done inside each branch should not be null; on the contrary, it seems that it is huge compared to the time of the branching itself;
4. your test seems to measure the CPU time of the whole program rather than the time of the branching algorithm alone (together with the worker time inside it).

I have done another test for you. Consider that I have "resurrected" my old netbook for you :-)

The results of my tests are reported here https://gist.github.com/szaghi/3b761e66234a90d4815ad30a4d81354c together with the complete program and the details of the compiler and PC used.

My test indicates that all three branching flows are comparable and goto is the worst.

I have tried to address what I think your test missed:

1. I provide a complete program;
2. I tried to mimic a varying workload in each branch;
3. I tried to enter each branch randomly (pseudo-randomly at least);
4. I tried to measure only the branching time;

I do not consider mine a very "significant" test, but yours is somewhat meaningless too. Goto is to be avoided for reasons much more important than the (supposed) higher speed of execution.

My best regards.

Gary Scott

unread,
Oct 15, 2016, 4:29:34 PM10/15/16
to
I was only interested in testing the "branch selection" process. I
understand that optimizers may optimize things away. I sampled without
optimization to assess the differences and did not note any substantial
difference.

This is an HP p6210, 8GB, 2.6Ghz, AMD620, WIN10 so IVF may or may not
generate the best code.

No, there are problem/solution domains where performance is imperative.
If GOTO offered a performance advantage, it would be entirely
appropriate to use it. I have not used it in decades, but if I had such
a need, I would not hesitate, if I had verified there was such a benefit
and such a benefit was important to the problem solution.


Gary Scott

unread,
Oct 15, 2016, 4:36:36 PM10/15/16
to
Correction, I do use GOTO for a very small number of cases where error
exit processing otherwise would be cumbersome. It would be possible in
some cases to have a giant bounding do loop, but these become messier to
follow when you are processing 100s of GUI callbacks. It can be easier
and more clear to separate the callback processing from the error exit
processing this way.
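
For comparison, here is a minimal sketch of the same "separate error-exit section" idea written without GOTO, using a named BLOCK construct with EXIT (a Fortran 2008 feature). The names `handle_callback`, `id` and the status codes are hypothetical; this is not the GUI code described above, only an illustration of the pattern.

```fortran
program error_exit_demo
   implicit none
   integer :: status

   call handle_callback(7, status)
   if (status /= 0) error stop 'a valid id should succeed'

   call handle_callback(-1, status)
   if (status /= 1) error stop 'a negative id should fail with status 1'

contains

   subroutine handle_callback(id, status)
      integer, intent(in)  :: id
      integer, intent(out) :: status

      status = 0
      work: block
         if (id < 0) then
            status = 1
            exit work      ! leave the normal path early
         end if
         if (id > 100) then
            status = 2
            exit work
         end if
         ! ... normal callback processing would go here ...
      end block work

      if (status == 0) return

      ! Common error-exit processing, kept apart from the callback logic.
      print '(a, i0, a, i0)', 'callback ', id, ' failed with status ', status
   end subroutine handle_callback

end program error_exit_demo
```

Whether this reads better than a single `go to` error label is a matter of taste; it needs a compiler that supports exiting a named BLOCK.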

FortranFan

unread,
Oct 15, 2016, 6:47:42 PM10/15/16
to
On Saturday, October 15, 2016 at 3:58:13 PM UTC-4, Gary Scott wrote:

> ..
>
> Suspicious...
>
> ..


You need to provide complete details for any meaningful discussion, starting with a reproducible case; otherwise I'll be suspicious indeed, for it provides me with no takeaways.

FortranFan

unread,
Oct 15, 2016, 7:12:00 PM10/15/16
to
On Saturday, October 15, 2016 at 4:03:58 PM UTC-4, Stefano Zaghi wrote:

> ..
>
> My test indicates that all three branching flows are comparable and goto is the worst.
>
> I have tried to address what I think your test missed:
>
> 1. I provide a complete program;
> 2. I tried to mimic a varying workload in each branch;
> 3. I tried to enter each branch randomly (pseudo-randomly at least);
> 4. I tried to measure only the branching time;
>
> .. Goto is to be avoided for reasons much more important than the (supposed) higher speed of execution.
>
> ..


Stefano,

Great work, you should create a Fortran MYTHBUSTERS collection on GitHub!!
https://en.wikipedia.org/wiki/MythBusters

Fyi, I was thinking somewhat along the same lines and ran a test around the SAME time as your post above - I list the details below which you can review and discard/include in any way you see fit. My read is also the same, there is hardly any discernible difference in CPU performance between the two approaches, but coders may again want to be reminded,

a) SELECT CASE helps in writing clear code that should be easier to understand and maintain,

b) computed GOTO is obsolescent in the Fortran standard, and

c) SELECT CASE works with integer, logical, AND *CHARACTER* scalar expressions whereas computed GOTO only uses scalar numeric expressions that may be converted to integer type.

Readers need to keep point (c) in mind, especially with respect to the original post that showed constructs such as << CASE ('keyword1') >>. A coder would then need to take some extra action - use of some integer tables, perhaps - to utilize computed GOTO statements.
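
As an illustration of the "integer table" step mentioned in point (c), here is a minimal sketch that maps a character keyword to an integer index, which could then drive an integer SELECT CASE or a computed GOTO. The function name `keyword_index` and the three-entry keyword list are assumptions for this example only; the case posted further down in this message calls its dispatchers directly with values it already knows.

```fortran
program keyword_index_demo
   implicit none
   character(len=3), parameter :: keywords(3) = &
      [ character(len=3) :: 's1', 's2', 's10' ]
   integer :: ikey

   if (keyword_index('s10') /= 3) error stop 'lookup of s10 failed'
   if (keyword_index('xyz') /= 0) error stop 'unknown keyword should give 0'

   ! The integer index can now be used where a character cannot.
   ikey = keyword_index('s2')
   select case (ikey)
   case (1)
      print '(a)', 'would call s1'
   case (2)
      print '(a)', 'would call s2'
   case (3)
      print '(a)', 'would call s10'
   case default
      print '(a)', 'invalid keyword'
   end select

contains

   ! Linear search; returns 0 when the keyword is not in the table.
   pure function keyword_index(keyword) result(ikey)
      character(len=*), intent(in) :: keyword
      integer :: ikey
      integer :: i
      ikey = 0
      do i = 1, size(keywords)
         if (keywords(i) == keyword) then
            ikey = i
            return
         end if
      end do
   end function keyword_index

end program keyword_index_demo
```

Note that this lookup is itself a scan over the keywords, so any timing comparison that starts from a character keyword would have to include it on the computed GOTO side.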

-- begin case --
module mykinds_m

use, intrinsic :: iso_fortran_env, only : I4 => int32, WP => real64

implicit none

real(WP), parameter :: ZERO = 0.0_wp

end module mykinds_m

module procs_m

use mykinds_m, only : WP, ZERO

implicit none

contains

subroutine s1( n, r )

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, "s1: allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

r = norm2( v )

return

end subroutine s1

subroutine s2( n, r )

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, "s2: allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

r = norm2( v )

return

end subroutine s2

subroutine s3( n, r )

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, "s3: allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

r = norm2( v )

return

end subroutine s3

subroutine s4( n, r )

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, "s4: allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

r = norm2( v )

return

end subroutine s4

subroutine s5( n, r )

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, "s5: allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

r = norm2( v )

return

end subroutine s5

subroutine s6( n, r )

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, "s6: allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

r = norm2( v )

return

end subroutine s6

subroutine s7( n, r )

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, "s7: allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

r = norm2( v )

return

end subroutine s7

subroutine s8( n, r )

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, "s8: allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

r = norm2( v )

return

end subroutine s8

subroutine s9( n, r )

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, "s9: allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

r = norm2( v )

return

end subroutine s9

subroutine s10( n, r )

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, "s10: allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

r = norm2( v )

return

end subroutine s10

end module procs_m

module m

use procs_m

implicit none

private

character(len=*), parameter, public :: keywords(*) = [ character(len=3) :: "s1", "s2", "s3", &
"s4", "s5", "s6", "s7", "s8", "s9", "s10" ]
integer, parameter, public :: ikeys(*) = [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]

public :: s_slc
public :: s_cgt

contains

subroutine s_slc( keyword, n, r )

character(len=*), intent(in) :: keyword
integer, intent(in) :: n
real(WP), intent(inout) :: r

select case ( keyword )

case ( keywords(1) )

call s1( n, r )

case ( keywords(2) )

call s2( n, r)

case ( keywords(3) )

call s3( n, r)

case ( keywords(4) )

call s4( n, r)

case ( keywords(5) )

call s5( n, r)

case ( keywords(6) )

call s6( n, r)

case ( keywords(7) )

call s7( n, r)

case ( keywords(8) )

call s8( n, r)

case ( keywords(9) )

call s9( n, r)

case ( keywords(10) )

call s10( n, r)

case default

print *, "s: invalid keyword of ", keyword
return

end select

return

end subroutine s_slc

subroutine s_cgt( ikey, n, r )

integer, intent(in) :: ikey
integer, intent(in) :: n
real(WP), intent(inout) :: r

goto (1,2,3,4,5,6,7,8,9,10), ikey

1 continue
call s1( n, r )
go to 99

2 continue
call s2( n, r )
go to 99

3 continue
call s3( n, r )
go to 99

4 continue
call s4( n, r )
go to 99

5 continue
call s5( n, r )
go to 99

6 continue
call s6( n, r )
go to 99

7 continue
call s7( n, r )
go to 99

8 continue
call s8( n, r )
go to 99

9 continue
call s9( n, r )
go to 99

10 continue
call s10( n, r )
go to 99 ! wonder if compiler optimization eliminates this

99 continue
return

end subroutine s_cgt

end module m
program p

use mykinds_m, only : I4, WP, ZERO
use m, only : s_slc, s_cgt, keywords

implicit none

!..
integer, parameter :: MAXREPEAT = 10
integer, parameter :: MAXTRIAL = 2**10
integer :: Idx(MAXTRIAL)
integer :: Counter
integer :: j
real(WP) :: Start_Time = ZERO
real(WP) :: End_Time = ZERO
real(WP) :: Ave_Time = ZERO
real(WP) :: CpuTimes_SLC(MAXREPEAT)
real(WP) :: CpuTimes_CGT(MAXREPEAT)
real(WP) :: r(MAXTRIAL)
real(WP) :: x_norm
character(len=*), parameter :: FMT_CPU = "(a, t40, g0, a)"

print *, "Mythbuster #1: SELECT CASE vs COMPUTED GOTO" // new_line("")

CpuTimes_SLC = ZERO
CpuTimes_CGT = ZERO

print *, "SELECT CASE:"
Loop_Repeat_Slc: do Counter = 1, MAXREPEAT

print *, " Trial ", Counter

call random_number(r)
Idx = int( r*10.0_wp, kind=kind(Idx) ) + 1

!..
call my_cpu_time(Start_Time)

do j = 1, MAXTRIAL

call s_slc( keywords( Idx(j) ), Idx(j), x_norm )

end do

call my_cpu_time(End_Time)

CpuTimes_SLC(Counter) = (End_Time - Start_Time)

write(*, fmt=FMT_CPU) " CPU Time: ", CpuTimes_SLC(Counter), " seconds."

end do Loop_Repeat_Slc

print *, "COMPUTED GOTO:"
Loop_Repeat_Cgt: do Counter = 1, MAXREPEAT

print *, " Trial ", Counter

call random_number(r)
Idx = int( r*10.0_wp, kind=kind(Idx) ) + 1

!..
call my_cpu_time(Start_Time)

do j = 1, MAXTRIAL

call s_cgt( Idx(j), Idx(j), x_norm )

end do

call my_cpu_time(End_Time)

CpuTimes_CGT(Counter) = (End_Time - Start_Time)

write(*, fmt=FMT_CPU) " CPU Time: ", CpuTimes_CGT(Counter), " seconds."

end do Loop_Repeat_Cgt


!.. Average CPU time: exclude highest and lowest values
Ave_Time = sum(CpuTimes_SLC)
Ave_Time = Ave_Time - maxval(CpuTimes_SLC) - minval(CpuTimes_SLC)
Ave_Time = Ave_Time/real(MAXREPEAT-2, kind=WP)
write(*, fmt=FMT_CPU) "SELECT CASE: Average CPU Time ", Ave_Time," seconds."

!.. Average CPU time: exclude highest and lowest values
Ave_Time = sum(CpuTimes_CGT)
Ave_Time = Ave_Time - maxval(CpuTimes_CGT) - minval(CpuTimes_CGT)
Ave_Time = Ave_Time/real(MAXREPEAT-2, kind=WP)
write(*, fmt=FMT_CPU) "COMPUTED GOTO: Average CPU Time ", Ave_Time," seconds."

!..
stop

contains

subroutine my_cpu_time( time )

!.. Argument list
real(WP), intent(inout) :: time

!.. Local variables
integer(I4) :: tick
integer(I4) :: rate

call system_clock (tick, rate)

time = real(tick, kind=kind(time) ) / real(rate, kind=kind(time) )

return

end subroutine my_cpu_time

end program p
-- end case --

Upon execution with Intel Fortran with /O2 on a Windows 7 laptop, Intel i5 CPU 2.7 GHz, 8 GB machine,

-- begin results --
Mythbuster #1: SELECT CASE vs COMPUTED GOTO

SELECT CASE:
Trial 1
CPU Time: .9999999892897904E-03 seconds.
Trial 2
CPU Time: .1999999978579581E-02 seconds.
Trial 3
CPU Time: .9999999892897904E-03 seconds.
Trial 4
CPU Time: .1999999978579581E-02 seconds.
Trial 5
CPU Time: .2000000007683411E-02 seconds.
Trial 6
CPU Time: .2000000007683411E-02 seconds.
Trial 7
CPU Time: .1000000018393621E-02 seconds.
Trial 8
CPU Time: .9999999892897904E-03 seconds.
Trial 9
CPU Time: .1999999978579581E-02 seconds.
Trial 10
CPU Time: .2000000007683411E-02 seconds.
COMPUTED GOTO:
Trial 1
CPU Time: .1000000018393621E-02 seconds.
Trial 2
CPU Time: .2000000007683411E-02 seconds.
Trial 3
CPU Time: .2000000007683411E-02 seconds.
Trial 4
CPU Time: .1000000018393621E-02 seconds.
Trial 5
CPU Time: .9999999892897904E-03 seconds.
Trial 6
CPU Time: .1000000018393621E-02 seconds.
Trial 7
CPU Time: .2000000007683411E-02 seconds.
Trial 8
CPU Time: .1999999978579581E-02 seconds.
Trial 9
CPU Time: .2000000007683411E-02 seconds.
Trial 10
CPU Time: .2000000007683411E-02 seconds.

SELECT CASE: Average CPU Time .1624999993509846E-02 seconds.
COMPUTED GOTO: Average CPU Time .1625000008061761E-02 seconds.
-- end results --

Basically, all that any of these CPU time measurements indicates is some statistical mean of the time spent in the keyword-processor subroutines.

dpb

unread,
Oct 15, 2016, 7:59:40 PM10/15/16
to
On 10/15/2016 3:03 PM, Stefano Zaghi wrote:
...

> My test indicates that all three branching flows are comparable and goto is the worst.
...

It'd be most informative to see what the code generator produced for the
switch logic between the three to account for those results.

Gary Scott

unread,
Oct 15, 2016, 8:18:18 PM10/15/16
to
my test indicates that the order of the 3 tests can impact the results
but that I can create a test where all 3 in sequence take almost exactly
the same time with or without optimization. In any event, the
differences are very small and not likely to be of importance in most cases.

campbel...@gmail.com

unread,
Oct 15, 2016, 9:13:23 PM10/15/16
to
On Sunday, October 16, 2016 at 10:12:00 AM UTC+11, FortranFan wrote:
> On Saturday, October 15, 2016 at 4:03:58 PM UTC-4, Stefano Zaghi wrote:
>
>
> use, intrinsic :: iso_fortran_env, only : I4 => int32, WP => real64
>
> subroutine my_cpu_time( time )
>
> !.. Argument list
> real(WP), intent(inout) :: time
>
> !.. Local variables
> integer(I4) :: tick
> integer(I4) :: rate
>
> call system_clock (tick, rate)
>
> time = real(tick, kind=kind(time) ) / real(rate, kind=kind(time) )
>
> return
>
> end subroutine my_cpu_time

I could not identify the version of the compiler and OS you are using but please re-do the test with I8 => int64 ; integer(I8) :: tick

There are lots of SYSTEM_CLOCK implementations using int32 that don't give adequate precision.

A computed GOTO (...), ikey is always going to be faster than SELECT CASE, as CASE handles much more general values, but:
1) It is a lot more difficult to maintain/extend code based on GOTO.
2) For a fair comparison, you should include the code that scans for ikey.
3) What does it matter, when the keyword is read from a file?
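
To see what the int32 versus int64 suggestion changes on a given compiler, a minimal sketch like the following can be used. It only prints the tick rate that SYSTEM_CLOCK reports for each integer kind; the actual values are processor dependent, so none are shown here.

```fortran
program clock_resolution
   use, intrinsic :: iso_fortran_env, only: int32, int64
   implicit none
   integer(int32) :: count32, rate32
   integer(int64) :: count64, rate64

   ! Query the clock once with each integer kind.
   call system_clock(count32, rate32)
   call system_clock(count64, rate64)

   print '(a, i0, a)', 'int32 arguments: ', rate32, ' ticks per second'
   print '(a, i0, a)', 'int64 arguments: ', rate64, ' ticks per second'
end program clock_resolution
```

If the int32 rate is on the order of a thousand ticks per second, timings of a millisecond or two, like those in the quoted results, are only one or two clock ticks long and cannot resolve small differences.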

Ian Harvey

unread,
Oct 15, 2016, 9:46:14 PM10/15/16
to
Thanks for the example code.

Note that the case construct and if construct options are not equivalent
to the computed goto option.

- When keyword has the value zero (which was perhaps not intentional -
maybe `keyword = int(random*3, int32) + 1` ?) the goto option does work,
while the other tests do not.

- When keyword has the value one, the goto option executes all three
worker subroutines, and when it has the value two, it executes both
`worker2` and `worker3`.

If I change the computed goto option to:

call system_clock(profiling(1), count_rate)
goto (10, 20, 30), keyword
goto 40
10 call worker1(key=keyword, array=key_work) ; goto 40
20 call worker2(key=keyword, array=key_work) ; goto 40
30 call worker3(key=keyword, array=key_work) ; goto 40
40 continue
call system_clock(profiling(2), count_rate)

then I see precious little difference in the results.

From looking at the generated assembly with ifort 17.0's default
command line optimisation on Windows x64, all options are implemented
using straightforward cmp and jxx instructions in pretty much the exact
same way, including the same nominal compiler branch selection
probabilities.

I personally find the case construct and if construct examples far
easier to read. Their use is demonstrably far less error prone too ;).
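
The fall-through behaviour described above can be reproduced with a minimal sketch. It is a hypothetical reconstruction and not the gist code: `dispatch_without_exits` and the `hits` counters are made-up names standing in for the three worker calls. A computed GOTO whose selector is out of range simply continues with the next statement, and without a `goto` after each branch, execution runs on through the following labels.

```fortran
program goto_fallthrough
   implicit none
   integer :: key, hits(3)
   ! Number of "workers" expected to run for key = 0, 1, 2, 3, 4.
   integer, parameter :: expected(0:4) = [3, 3, 2, 1, 3]

   do key = 0, 4
      hits = 0
      call dispatch_without_exits(key, hits)
      print '(a, i0, a, 3(1x, i0))', 'key = ', key, ' -> workers run:', hits
      if (sum(hits) /= expected(key)) error stop 'unexpected fall-through count'
   end do

contains

   subroutine dispatch_without_exits(key, hits)
      integer, intent(in)    :: key
      integer, intent(inout) :: hits(3)

      ! No "goto" after each branch, so control falls through to the next label.
      go to (10, 20, 30), key
10    hits(1) = hits(1) + 1
20    hits(2) = hits(2) + 1
30    hits(3) = hits(3) + 1
   end subroutine dispatch_without_exits

end program goto_fallthrough
```

A SELECT CASE or IF construct cannot fall through like this, which is part of why those versions are harder to get wrong.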

dpb

unread,
Oct 15, 2016, 9:58:37 PM10/15/16
to
On 10/15/2016 8:45 PM, Ian Harvey wrote:
...

> From looking at the generated assembly with ifort 17.0's default
> command line optimisation on Windows x64, all options are implemented
> using straightforward cmp and jxx instructions in pretty much the exact
> same way, including the same nominal compiler branch selection
> probabilities.
...

That's certainly what I'd expected, and hence that the results would
also be the same...

Stefano Zaghi

unread,
Oct 16, 2016, 1:27:25 AM10/16/16
to
Dear Gary,

feel free to think whatever you want, but the fact remains that:

1. you did not provide the whole test, thus yours is not reproducible => meaningless;
2. you did not try to mimic the OP's scenario => meaningless;
3. when performance is "imperative" (I do a lot of HPC, for your information), gotos are evil for multi-threading/vectorization (and you compile with optimizations enabled) => your conclusions are wrong.

Dear dpb, I provided the full program, so you can check what the optimizer generates, but as Ian checked, the Intel optimizer does a great job.

Dear Ian, thank you very much: last night I suspected there was something wrong, but I was too tired to check, and my knowledge of goto is near zero. This is the reason for publishing the full tests. I'll update the gist very soon, thank you very much!

Dear FortranFan, you are too kind, but I am not up to "busting" anything :-) I'll add your test to my gist soon.

My best regards.

Stefano Zaghi

unread,
Oct 16, 2016, 2:02:51 AM10/16/16
to
Dear all,

I have amended the test with Ian's correction (I also changed integer(int32) => integer(int64) for the tic-toc profiling, as others suggested). The amended code is here https://gist.github.com/szaghi/3b761e66234a90d4815ad30a4d81354c#benchmark-program

The results are essentially unchanged; see this https://gist.github.com/szaghi/3b761e66234a90d4815ad30a4d81354c#average-performances and this https://gist.github.com/szaghi/3b761e66234a90d4815ad30a4d81354c#benchmarks-output

The 3 branching models behave almost identically.

I added the wise observations of FortranFan here https://gist.github.com/szaghi/3b761e66234a90d4815ad30a4d81354c#conclusions

Please note that my conclusions are not as sharp as others': "for my test case goto SEEMS not to provide any speedup" is not an absolute statement like "goto is always faster than select case", which is evidently false, whereas it is true that goto is obsolescent, produces spaghetti code, is not clear, is not maintainable, and prevents HPC exploitation.

My best regards.

James Van Buskirk

unread,
Oct 16, 2016, 2:32:44 AM10/16/16
to
"Ian Harvey" wrote in message news:ntum4h$tto$1...@dont-email.me...

> From looking at the generated assembly with ifort 17.0's default command
> line optimisation on Windows x64, all options are implemented using
> straightforward cmp and jxx instructions in pretty much the exact same
> way, including the same nominal compiler branch selection probabilities.

What I usually do to check compiler output is to put the code in
question in a minimal procedure so that the results are easier to
interpret. Minimal compilable example, what a concept!

D:\gfortran\clf\selecttest>type select.f90
subroutine sub(knt)
implicit none
integer knt
select case(knt)
case(1)
knt = 2
case(2)
knt = 3
case(3)
knt = 1
case(4)
knt = 5
case(5)
knt = 4
end select
end subroutine sub

And an excerpt from the assembly output. Note that I used /O3 as in the
O.P. All three versions indeed seem to do the same thing: first the
argument knt is loaded into rax and checked to see whether it's in
range. Then it's moved to rdx and an address at a fixed offset from the
jump table is loaded into rax, after which the result is obtained in rcx
and moved into the argument memory.

D:\gfortran\clf\selecttest>ifort /O3 /c /FA select.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 16.0.2.180 Build 20160204
Copyright (C) 1985-2016 Intel Corporation. All rights reserved.

I used
SUB PROC
; parameter 1: rcx
.B1.1:: ; Preds .B1.0
L1::
;1.12
        mov r8, rcx ;1.12
        movsxd rax, DWORD PTR [r8] ;4.20
        dec rax ;4.20
        cmp rax, 4 ;4.20
        ja .B1.3 ; Prob 50% ;4.20
; LOE rax rbx rbp rsi rdi r8 r12 r13 r14 r15 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15
.B1.2:: ; Preds .B1.1
        mov edx, eax ;4.20
        lea rax, QWORD PTR [__ImageBase] ;4.20
        mov ecx, DWORD PTR [imagerel(.2.2_2.switchtab.0.0.1)+rax+rdx*4] ;4.20
        mov DWORD PTR [r8], ecx ;6.14
; LOE rbx rbp rsi rdi r12 r13 r14 r15 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15
.B1.3:: ; Preds .B1.1 .B1.2
        ret ;16.1
        ALIGN 16
; LOE
.B1.4::
; mark_end;
SUB ENDP

Here is the jump table:

.2.2_2.switchtab.0.0.1 DD 2
DD 3
DD 1
DD 5
DD 4

D:\gfortran\clf\selecttest>type goto.f90
subroutine sub(knt)
implicit none
integer knt
goto (1,2,3,4,5), knt

1 knt = 2
go to 10
2 knt = 3
go to 10
3 knt = 1
go to 10
4 knt = 5
go to 10
5 knt = 4

10 continue
end subroutine sub

D:\gfortran\clf\selecttest>ifort /O3 /c /FA goto.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 16.0.2.180 Build 20160204
Copyright (C) 1985-2016 Intel Corporation. All rights reserved.

Again, the assembly code and jump table:

SUB PROC
; parameter 1: rcx
.B1.1:: ; Preds .B1.0
L1::
;1.12
        mov eax, DWORD PTR [rcx] ;4.8
        dec eax ;4.8
        cmp eax, 4 ;4.8
        ja .B1.3 ; Prob 50% ;4.8
; LOE rax rcx rbx rbp rsi rdi r12 r13 r14 r15 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15
.B1.2:: ; Preds .B1.1
        lea rdx, QWORD PTR [__ImageBase] ;4.8
        mov eax, DWORD PTR [imagerel(.2.2_2.switchtab.0.0.1)+rdx+rax*4] ;4.8
        jmp .B1.4 ; Prob 100% ;4.8
; LOE rcx rbx rbp rsi rdi r12 r13 r14 r15 eax xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15
.B1.3:: ; Preds .B1.1
        mov eax, 2 ;6.14
; LOE rcx rbx rbp rsi rdi r12 r13 r14 r15 eax xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15
.B1.4:: ; Preds .B1.2 .B1.3
        mov DWORD PTR [rcx], eax ;6.14
        ret ;17.1
        ALIGN 16
; LOE
.B1.5::
; mark_end;
SUB ENDP

.2.2_2.switchtab.0.0.1 DD 2
DD 3
DD 1
DD 5
DD 4

D:\gfortran\clf\selecttest>type if.f90
subroutine sub(knt)
implicit none
integer knt
if (knt == 1) then
knt = 2
else if (knt == 2) then
knt = 3
else if (knt == 3) then
knt = 1
else if (knt == 4) then
knt = 5
else if (knt == 5) then
knt = 4
end if
end subroutine sub

D:\gfortran\clf\selecttest>ifort /O3 /c /FA if.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 16.0.2.180 Build 20160204
Copyright (C) 1985-2016 Intel Corporation. All rights reserved.

And the assembly code and jump table:

SUB PROC
; parameter 1: rcx
.B1.1:: ; Preds .B1.0
L1::
;1.12
mov r8, rcx ;1.12
mov eax, DWORD PTR [r8] ;4.8
dec eax ;4.16
cmp eax, 4 ;4.16
ja .B1.3 ; Prob 50% ;4.16
; LOE rax rbx rbp rsi rdi r8 r12 r13 r14 r15 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15
.B1.2:: ; Preds .B1.1
lea rdx, QWORD PTR [__ImageBase] ;4.16
mov ecx, DWORD PTR [imagerel(.2.2_2.switchtab.0.0.1)+rdx+rax*4] ;4.16
mov DWORD PTR [r8], ecx ;5.14
; LOE rbx rbp rsi rdi r12 r13 r14 r15 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15
.B1.3:: ; Preds .B1.1 .B1.2
ret ;15.1
ALIGN 16
; LOE
.B1.4::
; mark_end;
SUB ENDP

.2.2_2.switchtab.0.0.1 DD 2
DD 3
DD 1
DD 5
DD 4

They all look pretty similar to me, so I would surmise that any differences
in timing are artifacts of the timing methodology.
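
A side note on the listings above: for keys 1 to 5 the three forms agree, but the goto listing also carries a `mov eax, 2` on the out-of-range path. That is expected, since a computed GOTO whose index lies outside 1..n simply continues with the next statement, which in goto.f90 is the one at label 1. The sketch below is not from the original posts. It places the goto.f90 and if.f90 bodies next to a SELECT CASE equivalent (the names sub_select, sub_goto and sub_if are made up for illustration) and prints what each form does for knt = 0..6. For knt = 0 and knt = 6 the select and if forms should leave the value unchanged, while the goto form sets it to 2.

```fortran
program branch_forms
   implicit none
   integer :: k, a, b, c

   do k = 0, 6
      a = k
      b = k
      c = k
      call sub_select(a)
      call sub_goto(b)
      call sub_if(c)
      print '(a,i0,a,i0,a,i0,a,i0)', 'knt=', k, '  select=', a, '  goto=', b, '  if=', c
      ! For in-range keys all three forms must agree.
      if (k >= 1 .and. k <= 5) then
         if (a /= b .or. a /= c) error stop 'forms disagree for an in-range key'
      end if
   end do

contains

   ! SELECT CASE form: no CASE DEFAULT, so other values are left alone.
   subroutine sub_select(knt)
      integer, intent(inout) :: knt
      select case (knt)
      case (1)
         knt = 2
      case (2)
         knt = 3
      case (3)
         knt = 1
      case (4)
         knt = 5
      case (5)
         knt = 4
      end select
   end subroutine sub_select

   ! Computed GOTO form, as in goto.f90: an out-of-range index
   ! continues with the next statement, i.e. the one at label 1.
   subroutine sub_goto(knt)
      integer, intent(inout) :: knt
      goto (1, 2, 3, 4, 5), knt
1     knt = 2
      go to 10
2     knt = 3
      go to 10
3     knt = 1
      go to 10
4     knt = 5
      go to 10
5     knt = 4
10    continue
   end subroutine sub_goto

   ! IF / ELSE IF form, as in if.f90.
   subroutine sub_if(knt)
      integer, intent(inout) :: knt
      if (knt == 1) then
         knt = 2
      else if (knt == 2) then
         knt = 3
      else if (knt == 3) then
         knt = 1
      else if (knt == 4) then
         knt = 5
      else if (knt == 5) then
         knt = 4
      end if
   end subroutine sub_if

end program branch_forms
```

If the goto form is meant to match the other two exactly, an explicit `go to 10` placed right after the computed GOTO statement would cover the out-of-range case.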

dpb

Oct 16, 2016, 10:10:09 AM
On 10/16/2016 12:27 AM, Stefano Zaghi wrote:
...

> Dear dpb, I provided the full program, so you can check what the
> optimizer generates, but as Ian checked, the Intel optimizer does a
> great job.

...

Chill, dood! :)

Not your manhood being called into question here...just the likelihood
of what a modern optimizer will do with a couple or three different
constructions.

I don't have a recent (Fortran) compiler installed any longer; hence
it's not at all simple for me to generate realistic code to compare. I
was simply conjecturing that _IF_ a given test showed a particular bias
at some point in the past, _perhaps_ that indicated either that compiler
didn't optimize as well (yet) or the options to do so weren't in force
or somesuch. As James Van B posted, in isolation the code generated the
same machine instructions for each of three options (CASE, GOTO, IF).
That being the case (so to speak :) ), the timing must be the same for
that piece of code and so as he notes, anything that shows up
differently must be from other sources.

That there's advantage in source form for SELECT CASE in terms of human
legibility and extension isn't being argued too terribly much altho it's
also quite possible to write very well structured code using GOTO if one
uses discipline in doing so; particularly for such structures as
outlined here; having a common exit makes it pretty simple to lay out
the structure in linear fashion. Where spaghetti code arises is when
there's a much more convoluted logic path imposed in addition so that
execution reverts to an earlier section again or in implementations
where discipline was not maintained and code is scattered around more or
less willy-nilly. This certainly has happened and can happen again, but
isn't _necessarily_ the result of the judicious use of GOTO.

That it would be the first choice I'll grant is somewhat anachronistic,
yes, but recall that the poster of the suggestion is of an age that long
predates the alternate constructions into the language (or much of _any_
language, for that matter) and also based on his postings over the
years, his recent endeavors have continued to use older-vintage machines
and tools that don't necessarily have all the features or optimizations
of modern compilers so that in fact, he may have the case where the GOTO
option _is_ still actually faster because in that particular compiler
the optimization of the later construct may not have yet occurred.
Again, I don't know this, I have that compiler still on another old
machine here but it would take some significant effort to drag it out
and get the necessary peripherals hooked up to test and it's not worth
the effort to do so...

Gary Scott

Oct 16, 2016, 4:04:24 PM
On 10/16/2016 12:27 AM, Stefano Zaghi wrote:
I replied with the complete test code with exception of
declarations...the ttvalues are 64 bit reals, the array of clock ticks
are 64 bit integers. The call procedures are wrappers to intel MKL
procedures. They will be obvious. The wrappers do little but provide
some additional "time tag reset" features and features to synchronize to
the real time clock provided by date_and_time. I use that to help
ensure consistency and identify sample glitches caused by OS paging or
cache misses or whatever other interruptions there might be.

In any event, my soffit repair is a more important task for this weekend,
before it gets cold out.

I'm fairly convinced that in most cases, the goto will perform nearly
identically to select case/if. I was able to identify a scenario where
it was faster and scenarios where it was slower by roughly 7 ns. That's
in the noise and not consistent, so probably caused by optimizer
oddities or cache behavior differences or you name it.

I did not intend to do anything but toy with this until I was able to
get to work on my repair. I do think that there is a selection behavior
with goto that causes it to occasionally take extra time to select a
path. While most path selections were identical to select case or if,
there was a rare, fairly random case where rather than ~38ns, it took on
the order of 1us. This difference was fairly consistent with computed
goto loop and was never seen with select case or if.

Stefano Zaghi

Oct 16, 2016, 4:22:12 PM
Dear dpb,

Il giorno domenica 16 ottobre 2016 16:10:09 UTC+2, dpb ha scritto:
> Chill, dood! :)

Sorry, my English is very limited, and slang is not covered at all. I guess this means something like "take it easy, keep calm..."; if so, I am really as cool as ice :-)

> Not your manhood being called into question here...just the likelihood
> of what a modern optimizer will do with a couple or three different
> constructions.

My "manhood" is a matter for my wife. I simply observed that I have provided the full test, so anyone can stop guessing, supposing and conjecturing, and instead try their compiler of choice, as Ian and James have already done. I am really calm, and you?

> I don't have a recent (Fortran) compiler installed any longer; hence
> it's not at all simple for me to generate realistic code to compare. I
> was simply conjecturing that _IF_ a given test showed a particular bias
> at some point in the past, _perhaps_ that indicated either that compiler
> didn't optimize as well (yet) or the options to do so weren't in force
> or somesuch. As James Van B posted, in isolation the code generated the
> same machine instructions for each of three options (CASE, GOTO, IF).
> That being the case (so to speak :) ), the timing must be the same for
> that piece of code and so as he notes, anything that shows up
> differently must be from other sources.

I agree, and I was of the same opinion even before the test. My test produces the same timing for all the branching constructs; it is Gary's test that seems to show a different bias, but we cannot reproduce it.

> That there's advantage in source form for SELECT CASE in terms of human
> legibility and extension isn't being argued too terribly much altho it's
> also quite possible to write very well structured code using GOTO if one
> uses discipline in doing so; particularly for such structures as
> outlined here; having a common exit makes it pretty simple to lay out
> the structure in linear fashion.

I disagree. Even Ian's very simple goto is more complicated than the equivalent select case or if/else if constructs: Ian was very disciplined, but the goto remains unreadable.

> Where spaghetti code arises is when
> there's a much more convoluted logic path imposed in addition so that
> execution reverts to an earlier section again or in implementations
> where discipline was not maintained and code is scattered around more or
> less willy-nilly. This certainly has happened and can happen again, but
> isn't _necessarily_ the result of the judicious use of GOTO.

I disagree. Goto just makes it easy to jump anywhere; disciplined branching can be done with other, saner constructs. Teaching new coders to use goto is really bad teaching.

> That it would be the first choice I'll grant is somewhat anachronistic,
> yes, but recall that the poster of the suggestion is of an age that long
> predates the alternate constructions into the language (or much of _any_
> language, for that matter) and also based on his postings over the
> years, his recent endeavors have continued to use older-vintage machines
> and tools that don't necessarily have all the features or optimizations
> of modern compilers so that in fact, he may have the case where the GOTO
> option _is_ still actually faster because in that particular compiler
> the optimization of the later construct may not have yet occurred.

I think I am missing most of this sentence. However, it seems to me that you are trying to cover all cases, which is just trying to "climb a mirror" (clutching at straws): yes, if you are running a '90s x86 machine with f77, sure, goto is for you. We are now in 2016 (close to 2017), and hopefully select case and if/else if are well implemented in almost all compilers. Thus, no, goto nowadays does not offer anything more than select case and if/else if: it is not faster, only more error-prone. It is very curious that in Van Snyder's thread you claimed that my approach favors "snappy coders", while now all coders are so disciplined that goto is a reliable approach... funny.

> Again, I don't know this, I have that compiler still on another old
> machine here but it would take some significant effort to drag it out
> and get the necessary peripherals hooked up to test and it's not worth
> the effort to do so...

Ian and James have shown that the Intel and GNU compilers do a good job with all three branching models. If you need other tests, I can try to run them, if I am up to the task.

My best regards.

campbel...@gmail.com

Oct 16, 2016, 8:51:29 PM
On Monday, October 17, 2016 at 7:22:12 AM UTC+11, Stefano Zaghi wrote:
>
> I disagree. Even the very simple goto of Ian is more complicated than the equivalent select case or if elseif constructs: Ian was very disciplined, but the goto remains unreadable.
>
I see no difference between SELECT CASE and GOTO coding structure, and I would suggest that the compiler would make the GOTO a worse case. All this adverse talk about GOTO is unnecessary bias. I find the following easy to read.

subroutine s_cgt( ikey, n, r )

integer, intent(in) :: ikey
integer, intent(in) :: n
real(WP), intent(inout) :: r

goto (1,2,3,4,5,6,7,8,9,10), ikey

call s_error (ikey)
go to 99

1 call s1( n, r )
go to 99

2 call s2( n, r )
go to 99

3 call s3( n, r )
go to 99

4 call s4( n, r )
go to 99

5 call s5( n, r )
go to 99

6 call s6( n, r )
go to 99

7 call s8( n, r )
go to 99

8 call s8( n, r )
go to 99

9 call s9( n, r )
go to 99

10 call s10( n, r )
go to 99

99 return

end subroutine s_cgt

campbel...@gmail.com

Oct 16, 2016, 8:54:38 PM
On Monday, October 17, 2016 at 11:51:29 AM UTC+11, campbel...@gmail.com wrote:
> I see no difference between SELECT CASE and GOTO coding structure, and I would suggest that the compiler would make the GOTO a worse case. All this adverse talk about GOTO is unnecessary bias. I find the following easy to read.
>
Why can't we edit posts !!

I meant to say : that the compiler would NOT make the GOTO a worse case

Ian Harvey

Oct 16, 2016, 11:12:46 PM
On 2016-10-17 11:51, campbel...@gmail.com wrote:
> On Monday, October 17, 2016 at 7:22:12 AM UTC+11, Stefano Zaghi
> wrote:
>>
>> I disagree. Even the very simple goto of Ian is more complicated
>> than the equivalent select case or if elseif constructs: Ian was
>> very disciplined, but the goto remains unreadable.
>>
> I see no difference between SELECT CASE and GOTO coding structure,
> and I would suggest that the compiler would make the GOTO a worse
> case. All this adverse talk about GOTO is unnecessary bias. I find
> the following easy to read.
>
> subroutine s_cgt( ikey, n, r )
>
> integer, intent(in) :: ikey
> integer, intent(in) :: n
> real(WP), intent(inout) :: r
>
> goto (1,2,3,4,5,6,7,8,9,10), ikey

I read the above statement, and it tells me that execution is going
somewhere, based on the value of ikey. It tells me nothing about why
execution is branching, or the nature of the branch.

(You knew when you wrote that statement the why and what, but will the
person who has to maintain this code (even if it is still you), know
tomorrow?)

On the other hand,

select case (ikey)

tells me that different _blocks of code_ are about to be
executed, based on the value of ikey.

The semantics of SELECT CASE are a better fit to the story that the
source code is trying to tell.

(This is the same reasoning I use in preferring a case construct over
sequential if constructs for this scenario - I read "select case", I
know I am selecting a case based on a single value, while "if ..." just
tells me conditional execution - I need to look at all the subsequent
tests to work out what is going on.)

Similar clarity of code comments apply to the notional blocks below.
For example, the delineation of the blocks isn't anywhere as distinct as
the explicit block syntax of select case - you need to read ahead to
figure out that any one goto to label 99 effectively means "leave the
construct", you need to confirm that goto is present in all blocks, and
you may need to consider the possibility of jumps into the construct
from elsewhere. I also find the labelling of the condition that
triggers the entry into the block far more indirect - if two such
constructs are required in the same inclusive scope then some sort of
label offset will be required (be careful to not mix up the sets),
labels in the list are not required to be unique, if you happen to miss
a label in the list the compiler isn't going to care, and the same block
of code can be addressed by multiple labels.

It is far clearer to have the structural blocks of code delineated by
structural syntax, rather than them being inferred.

Why didn't s7 get some love?
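
For comparison, here is a minimal sketch of the same dispatch written with SELECT CASE on the integer key. It is an illustration only, not code from the thread: the module name dispatch_sketch_m, the routine name s_dispatch, and the trivial stub bodies of s1 to s3 and s_error are assumptions, and only three keys are shown to keep it short. Each block is delimited by the construct itself, the error path is an explicit CASE DEFAULT, and no labels are needed.

```fortran
module dispatch_sketch_m
   implicit none
   private
   public :: WP, s_dispatch

   integer, parameter :: WP = selected_real_kind(15, 307)

contains

   subroutine s_dispatch(ikey, n, r)
      integer,  intent(in)    :: ikey
      integer,  intent(in)    :: n
      real(WP), intent(inout) :: r

      ! One block per key; anything else goes to the error handler.
      select case (ikey)
      case (1)
         call s1(n, r)
      case (2)
         call s2(n, r)
      case (3)
         call s3(n, r)
      case default
         call s_error(ikey)
      end select
   end subroutine s_dispatch

   ! Hypothetical stubs standing in for the real keyword handlers.
   subroutine s1(n, r)
      integer,  intent(in)    :: n
      real(WP), intent(inout) :: r
      r = 1.0_WP * n
   end subroutine s1

   subroutine s2(n, r)
      integer,  intent(in)    :: n
      real(WP), intent(inout) :: r
      r = 2.0_WP * n
   end subroutine s2

   subroutine s3(n, r)
      integer,  intent(in)    :: n
      real(WP), intent(inout) :: r
      r = 3.0_WP * n
   end subroutine s3

   subroutine s_error(ikey)
      integer, intent(in) :: ikey
      print '(a,i0)', 's_dispatch: invalid key ', ikey
   end subroutine s_error

end module dispatch_sketch_m

program dispatch_demo
   use dispatch_sketch_m, only : WP, s_dispatch
   implicit none
   real(WP) :: r
   integer  :: ikey

   ! Key 4 has no handler and exercises the CASE DEFAULT branch.
   do ikey = 1, 4
      r = 0.0_WP
      call s_dispatch(ikey, 10, r)
      print '(a,i0,a,f6.1)', 'ikey=', ikey, '  r=', r
   end do
end program dispatch_demo
```

With this layout, a mismatch such as label 7 calling s8 is at least easier to spot, because the case value and the call sit on adjacent lines.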

Stefano Zaghi

Oct 16, 2016, 11:35:37 PM
Dear Campbel,
We have gone from "GOTO is always compiled into a faster branching" to "the compiler would NOT make the goto a worse case"... funny.

I simply replied to those who suggested using goto instead of select case because goto is faster: this is a myth (it may have been true in the good old days when select case was new, but not now, and we are living now, not in the past), and there are many good reasons to avoid goto.

You may think that goto is wonderful, but that is your reality; I think it is not, so for me it is necessary to point this out, and it really is not unnecessary talk. As Ian has replied, goto is not really readable; it is error-prone, it is spaghetti-code-prone, it prevents multi-threading, and (in its computed form) it is an obsolescent feature.

My best regards.

Stefano Zaghi

Oct 17, 2016, 12:11:04 AM
Dear Gary,

Il giorno domenica 16 ottobre 2016 22:04:24 UTC+2, Gary Scott ha scritto:
> I replied with the complete test code with exception of
> declarations...the ttvalues are 64 bit reals, the array of clock ticks
> are 64 bit integers. The call procedures are wrappers to intel MKL
> procedures. They will be obvious. The wrappers do little but provide
> some additional "time tag reset" features and features to synchronize to
> the real time clock provided by date_and_time. I use that to help
> ensure consistency and identify sample glitches caused by OS paging or
> cache misses or whatever other interruptions there might be.

I am sorry, I missed the whole program, my bad, mea culpa. Anyhow, you have used almost as many lines to describe the "exceptions or missing lines" as the lines of code you actually posted... Can you re-post the whole program for me, so that I can easily (cut/paste) reproduce your test exactly? I am sorry, but my interpretation of your "exceptions description" could lead me to generate a different test.

> In any event, my soffit repair is a more important task for this weekend,
> before it gets cold out.

Good luck for your "soffit".

> I'm fairly convinced that in most cases, the goto will perform nearly
> identically to select case/if. I was able to identify a scenario where
> it was faster and scenarios where it was slower by roughly 7 ns.

Well, until I can reproduce your test exactly, I suspect that you have identified nothing. 7 ns with respect to what?

> That's
> in the noise and not consistent, so probably caused by optimizer
> oddities or cache behavior differences or you name it.

Probably, I agree.

> I did not intend to do anything but toy with this until I was able to
> get to work on my repair. I do think that there is a selection behavior
> with goto that causes it to occasionally take extra time to select a
> path. While most path selections were identical to select case or if,
> there was a rare, fairly random case where rather than ~38ns, it took on
> the order of 1us. This difference was fairly consistent with computed
> goto loop and was never seen with select case or if.

I am sorry, but I am getting lost. You stated that goto plays a role when "performance is imperative"; now it seems to be the contrary. Probably I have missed some new results arising from your test. In your test(s), does goto take extra time? Indeed, I would bet that the three branching models have the same performance (in my opinion, talking about 38 ns or similar measurements is meaningless unless accurately contextualized).

If you agree that there is no relevant speedup from using goto, why should one use it nowadays?

My best regards.

FortranFan

Oct 17, 2016, 1:15:35 AM
On Sunday, October 16, 2016 at 11:12:46 PM UTC-4, Ian Harvey wrote:

> On 2016-10-17 11:51, campbelljohn wrote:
> .. I find
> > the following easy to read.
> >
> > subroutine s_cgt( ikey, n, r )
> >
> > integer, intent(in) :: ikey
> > integer, intent(in) :: n
> > real(WP), intent(inout) :: r
> >
> > goto (1,2,3,4,5,6,7,8,9,10), ikey
>
> .. You knew when you wrote that statement the why and what, but will the
> person who has to maintain this code (even if it is still you), know
> tomorrow?..
>

Another point to keep in mind is one I have mentioned several times before on this forum: in our experience, the few young engineers and scientists who are coming to Fortran often have backgrounds in other coding approaches such as MATLAB, Visual Basic, C/C++ etc. That is, Fortran is often NOT their first choice for a programming language. These other languages have "switch" and "Select Case" constructs which do have some *commonality* with "select case" in Fortran; please note I'm not trying to ignore the differences between these languages themselves or with Fortran (thus there is no need to further diverge the thread on this account), I'm simply bringing up aspects that help young coders pick up a new language more easily: SELECT CASE in Fortran happens to be one of them. In fact, I do recall a project where a young engineer in mid-20s in the team had lots of issues fixing and extending code that made extensive use of computed GOTOs: this person with good background in MATLAB and Visual Basic .NET simply couldn't relate to computed GOTO. The code in this project is one of the few still left with older style in this organization i.e., fixed format with constructs ranging from FORTRAN IV to 66/77 with some DEC extensions mainly due to the approach by a 3rd party simulation package that is involved, one that Ian Harvey seems to have experience with, one whose first name is a popular ski area in western US!

https://www.mathworks.com/help/matlab/ref/switch.html?requestedDomain=www.mathworks.com
https://msdn.microsoft.com/en-us/library/cy37t14y.aspx


> Why didn't s7 get some love?
>
> ..

Good point! Looks like I snuck in the vicious discrimination against this poor sub that went under John Campbell's radar.

FortranFan

Oct 17, 2016, 1:44:01 AM
On Sunday, October 16, 2016 at 10:10:09 AM UTC-4, dpb wrote:

> .. recall that the poster of the suggestion is of an age that long
> predates the alternate constructions into the language (or much of _any_
> language, for that matter) and also based on his postings over the
> years, his recent endeavors have continued to use older-vintage machines
> and tools that don't necessarily have all the features or optimizations
> of modern compilers so that in fact, he may have the case where the GOTO
> option _is_ still actually faster because in that particular compiler
> the optimization of the later construct may not have yet occurred.
> Again, I don't know this ..


@dpb,

Look upthread - several requests were made to said poster to provide some details but there has been no response yet. Note that further information is given below, and the request is extended again in the hope that he will enlighten us.

@Terence/FJ,
If you are still following this, it will be helpful if you can provide some actionable evidence to back up your comment, "I still prefer a computed GOTO over CASE anyway, since this is usually compiled as a faster selection." and "I did such benchmark few years ago ... and Terence is right !"

Terence, I do recall reading you use Compaq Visual Fortran (published circa 2001?) on some of your "older-vintage machines" and if so, please note I have tried my test case with Compaq Visual Fortran 6.6a and again I do NOT see any performance difference with /O2 optimization option. You can take the test code and report what you see. Note the CPU performance timing results are mainly the CPU demand of the keyword-processing subroutines - if these procedures are trivial, chances are in actual code users will fail to notice anything but if the procedures are very CPU intensive, the use of computed GOTO is not going to help any - it will be the kind of "premature optimization" Knuth decried. So please explain.

N.B.: do NOT define a preprocessor directive of IFORT if you are using Compaq Visual Fortran; the code below uses such a directive to work with the two compilers.

Upon execution on Windows 7 OS, Intel i5 CPU, 2.7 GHz with 8 GB RAM:
-- begin output --
Mythbuster #1: SELECT CASE vs COMPUTED GOTO

SELECT CASE:
Trial 1
CPU Time: 3.999948501586914E-03 seconds.
Trial 2
CPU Time: 5.000114440917969E-03 seconds.
Trial 3
CPU Time: 3.999948501586914E-03 seconds.
Trial 4
CPU Time: 3.000020980834961E-03 seconds.
Trial 5
CPU Time: 4.000186920166016E-03 seconds.
Trial 6
CPU Time: 3.999948501586914E-03 seconds.
Trial 7
CPU Time: 3.999948501586914E-03 seconds.
Trial 8
CPU Time: 4.000186920166016E-03 seconds.
Trial 9
CPU Time: 3.999948501586914E-03 seconds.
Trial 10
CPU Time: 3.000020980834961E-03 seconds.
COMPUTED GOTO:
Trial 1
CPU Time: 4.000186920166016E-03 seconds.
Trial 2
CPU Time: 3.999948501586914E-03 seconds.
Trial 3
CPU Time: 3.999948501586914E-03 seconds.
Trial 4
CPU Time: 2.999782562255859E-03 seconds.
Trial 5
CPU Time: 3.999948501586914E-03 seconds.
Trial 6
CPU Time: 4.000186920166016E-03 seconds.
Trial 7
CPU Time: 3.999948501586914E-03 seconds.
Trial 8
CPU Time: 3.999948501586914E-03 seconds.
Trial 9
CPU Time: 2.999782562255859E-03 seconds.
Trial 10
CPU Time: 3.999948501586914E-03 seconds.
SELECT CASE: Average CPU Time 3.875017166137695E-03 seconds.
COMPUTED GOTO: Average CPU Time 3.874957561492920E-03 seconds.
Press any key to continue
-- end output --

Here's the full test case:
-- begin code --
module mykinds_m

!dec$ if defined (IFORT)

use, intrinsic :: iso_fortran_env, only : I4 => int32, I8 => int64, WP => real64

implicit none

!dec$ else

implicit none

integer, parameter :: I8 = selected_int_kind(10)
integer, parameter :: I4 = selected_int_kind(9)
integer, parameter :: WP = selected_real_kind(15,307)

!dec$ endif

real(WP), parameter :: ZERO = 0.0_wp

end module mykinds_m

module procs_m

use mykinds_m, only : WP, ZERO

implicit none

contains

subroutine s1( n, r )

include "i.f90"

end subroutine s1

subroutine s2( n, r )

include "i.f90"

end subroutine s2

subroutine s3( n, r )

include "i.f90"

end subroutine s3

subroutine s4( n, r )

include "i.f90"

end subroutine s4

subroutine s5( n, r )

include "i.f90"

end subroutine s5

subroutine s6( n, r )

include "i.f90"

end subroutine s6

subroutine s7( n, r )

include "i.f90"

end subroutine s7

subroutine s8( n, r )

include "i.f90"

end subroutine s8

subroutine s9( n, r )

include "i.f90"

end subroutine s9

subroutine s10( n, r )

include "i.f90"

end subroutine s10

end module procs_m

module m

use procs_m

implicit none

private

!dec$ if defined (IFORT)
character(len=*), parameter, public :: keywords(*) = [ character(len=3) :: "s1", "s2", "s3", &
"s4", "s5", "s6", "s7", "s8", "s9", "s10" ]
integer, parameter, public :: ikeys(*) = [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]
!dec$ else
integer, parameter :: MAXKEYS = 10
character(len=3), parameter, public :: keywords(MAXKEYS) = [ "s1 ", "s2 ", "s3 ", &
"s4 ", "s5 ", "s6 ", "s7 ", "s8 ", "s9 ", "s10" ]
integer, parameter, public :: ikeys(MAXKEYS) = [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]
!dec$ endif

public :: s_slc
public :: s_cgt

contains

subroutine s_slc( keyword, n, r )

character(len=*), intent(in) :: keyword
integer, intent(in) :: n
real(WP), intent(inout) :: r

select case ( keyword )

case ( keywords(1) )

call s1( n, r )

case ( keywords(2) )

call s2( n, r)

case ( keywords(3) )

call s3( n, r)

case ( keywords(4) )

call s4( n, r)

case ( keywords(5) )

call s5( n, r)

case ( keywords(6) )

call s6( n, r)

case ( keywords(7) )

call s7( n, r)

case ( keywords(8) )

call s8( n, r)

case ( keywords(9) )

call s9( n, r)

case ( keywords(10) )

call s10( n, r)

case default

print *, "s: invalid keyword of ", keyword
return

end select

return

end subroutine s_slc

subroutine s_cgt( ikey, n, r )

integer, intent(in) :: ikey
integer, intent(in) :: n
real(WP), intent(inout) :: r

goto (1,2,3,4,5,6,7,8,9,10), ikey

1 continue
call s1( n, r )
go to 99

2 continue
call s2( n, r )
go to 99

3 continue
call s3( n, r )
go to 99

4 continue
call s4( n, r )
go to 99

5 continue
call s5( n, r )
go to 99

6 continue
call s6( n, r )
go to 99

7 continue
call s7( n, r )
go to 99

8 continue
call s8( n, r )
go to 99

9 continue
call s9( n, r )
go to 99

10 continue
call s10( n, r )
go to 99 ! wonder if compiler optimization eliminates this

99 continue
return

end subroutine s_cgt

end module m

module cpu_m

use mykinds_m, only : IP => I8, WP

implicit none

contains

subroutine my_cpu_time( time )

!.. Argument list
real(WP), intent(inout) :: time

!.. Local variables
integer(IP) :: tick
integer(IP) :: rate

! Note: system_clock reports elapsed (wall-clock) time, not CPU time.
call system_clock (tick, rate)

time = real(tick, kind=kind(time) ) / real(rate, kind=kind(time) )

return

end subroutine my_cpu_time

end module cpu_m
! include i.f90

!.. Argument list
integer, intent(in) :: n
real(WP), intent(inout) :: r

!.. Local variables
integer :: v_size
integer :: istat
real(WP), allocatable :: v(:)

v_size = 2**( min(n,10) )
allocate( v(v_size), stat=istat )
if ( istat /= 0 ) then
print *, ": allocation of v failed: v_size, stat = ", v_size, istat
stop
end if

call random_number( v )

!dec$ if defined (IFORT)
r = norm2( v )
!dec$ else
r = sqrt( dot_product(v, v)/real(v_size, kind=kind(r)) )
!dec$ endif

deallocate( v, stat=istat )

return

! main program
program p

use mykinds_m, only : WP, ZERO
use m, only : s_slc, s_cgt, keywords
use cpu_m, only : my_cpu_time

implicit none

!..
integer, parameter :: MAXREPEAT = 10
integer, parameter :: MAXTRIAL = 2**10
integer :: Idx(MAXTRIAL)
integer :: Counter
integer :: j
real(WP) :: Start_Time = ZERO
real(WP) :: End_Time = ZERO
real(WP) :: Ave_Time = ZERO
real(WP) :: CpuTimes_SLC(MAXREPEAT)
real(WP) :: CpuTimes_CGT(MAXREPEAT)
real(WP) :: r(MAXTRIAL)
real(WP) :: x_norm
!dec$ if defined (IFORT)
character(len=*), parameter :: FMT_CPU = "(a, t40, g0, a)"
!dec$ else
character(len=22), parameter :: FMT_CPU = "(A, T40, 1PG22.15, A)"
!dec$ end if

print *, "Mythbuster #1: SELECT CASE vs COMPUTED GOTO"
print *
end program p
-- end code --

FortranFan

Oct 17, 2016, 1:50:50 AM
On Saturday, October 15, 2016 at 9:13:23 PM UTC-4, campbel...@gmail.com wrote:

> ..
>
> I could not identify the version of the compiler and OS you are using but please re-do the test with I8 => int64 ; integer(I8) :: tick ..
>

@John Campbell,

Note "Upon execution with Intel Fortran with /O2 on a Windows 7 laptop, Intel i5 CPU 2.7 GHz, 8 GB machine" - I should have added compiler 17.0. Note I have posted the full code, so anyone can try it for themselves if they were so compelled.

I tried as you suggested with int64 as the integer type for tick actual argument in system_clock routine, but it made no discernible difference with the case I presented. But thanks much, I can see how it will help in other cases.
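
For anyone who wants to check what the integer kind changes on their own setup, here is a small sketch (not part of the test case above; the program name clock_kinds and the dummy workload are made up for illustration). It queries SYSTEM_CLOCK once with int32 arguments and once with int64 arguments and prints the reported count rates, then times a short loop with the int64 clock. The rates are processor-dependent, so no particular values are assumed here; some compilers report a finer tick for the wider kind and some report the same one.

```fortran
program clock_kinds
   use, intrinsic :: iso_fortran_env, only : int32, int64, real64
   implicit none

   integer(int32) :: tick32, rate32
   integer(int64) :: tick64, rate64, t0, t1
   real(real64)   :: x, elapsed
   integer        :: i

   ! Ask for the clock rate with two different integer kinds.
   call system_clock(count=tick32, count_rate=rate32)
   call system_clock(count=tick64, count_rate=rate64)
   print '(a,i0)', 'count_rate with int32 arguments: ', rate32
   print '(a,i0)', 'count_rate with int64 arguments: ', rate64

   ! A count_rate of zero means no clock is available.
   if (rate64 <= 0_int64) error stop 'no system clock available'

   ! Time a dummy workload with the int64 clock (elapsed wall-clock time).
   call system_clock(count=t0)
   x = 0.0_real64
   do i = 1, 1000000
      x = x + sqrt(real(i, real64))
   end do
   call system_clock(count=t1)

   elapsed = real(t1 - t0, real64) / real(rate64, real64)
   print '(a,es12.4,a)', 'elapsed:  ', elapsed, ' s'
   ! Printing the sum keeps the optimizer from discarding the loop.
   print '(a,es12.4)', 'checksum: ', x
end program clock_kinds
```

If the two rates differ, timings shorter than one tick of the coarser clock will be quantized, which is worth keeping in mind when comparing runs that take only a few milliseconds.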


Stefano Zaghi

Oct 17, 2016, 8:48:56 AM