I wanted to know which is better to use in COBOL code:
EVALUATE or IF statements.
Thanks in advance.
Regards!
It depends on the circumstances. IF is better in some cases, EVALUATE is
better in others. What do you want to do?
To add to what Doug said, my advice is to use the statement that makes the program easiest for a programmer to read and understand.
If the point of your question is about code efficiency, you almost certainly can ignore that. Unless your program is very unusual, it spends at most only a couple of percent of its execution time in the code you write. Measurements have shown that typical business applications spend most of their time executing operating system instructions. So minor differences in how you code the computations in your program have an unnoticeable effect on the CPU consumption of your process.
There can be cases where what look like minor differences in the program code make large differences in whether OS calls are involved, but the difference between doing conditional code with IF vs. with EVALUATE is not one of those cases.
Besides how easy it is to understand the program, the only other factor that might influence whether to use IF or EVALUATE might be whether the debugger handles one better than another when you are stepping through the code or setting breakpoints. I don't know whether there is any difference in that regard. Perhaps someone else who has had experience with that can comment on it.
I would give a rather simple answer:
If you have only 2 possible values (TRUE/FALSE), use IF. If there are
several possible values, use EVALUATE; it makes the program more
readable. I do not think there is any performance issue.
Wolfgang
>I would give a rather simple answer:
>If you have only 2 possible values (TRUE/FALSE), use IF. If there are
>several possible values, use EVALUATE; it makes the program more
>readable. I do not think there is any performance issue.
I think that's too simple an answer; the question is much too broad for any
one-size-fits-all answer.
In order to give the OP a really useful answer to his question, we need to
know the circumstances in which he's contemplating using the two statements.
Since he hasn't responded to my question asking just that, I'm beginning to
suspect that the OP didn't actually have any real-world example, and was just
looking for a simple answer that he could use in hopes of passing a technical
interview somewhere.
Also consider that the native compilers have a restriction on nested
IF statements, so you may hit a point where you have no choice but to
use EVALUATE.
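For illustration (field and paragraph names are made up), here is the same three-way test as a nested IF chain, which counts against the nesting limit, and as a flat EVALUATE, which does not nest at all:

```cobol
* Each ELSE IF adds another nesting level:
IF WS-CODE = 1
    PERFORM 100-PROCESS-ONE
ELSE
    IF WS-CODE = 2
        PERFORM 200-PROCESS-TWO
    ELSE
        IF WS-CODE = 3
            PERFORM 300-PROCESS-THREE
        END-IF
    END-IF
END-IF

* The equivalent EVALUATE stays flat no matter how many cases:
EVALUATE WS-CODE
    WHEN 1     PERFORM 100-PROCESS-ONE
    WHEN 2     PERFORM 200-PROCESS-TWO
    WHEN 3     PERFORM 300-PROCESS-THREE
    WHEN OTHER CONTINUE
END-EVALUATE
```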
Dave
Any procedure so complex that it maxes out the limit on nested IFs is a
procedure crying out to be rewritten.
Trust me, it was.
Which is a better option here? Using EVALUATE (this will make the code
long) or IF (lower readability in comparison to the EVALUATE statement)?
Why? What performance problems are you seeing, and why do you suppose they
might be related to the choice of IF vs. EVALUATE? (Hint: if you think that
either one has anything to do with any performance problems you're observing,
you're almost certainly wrong.)
>I had to either use 4 IFs (if, else if, else if, else) or an EVALUATE
>condition. The second problem I was facing was that I could not use
>any operator (OR or AND) within the EVALUATE condition
Why not?
>which was making
>the Evaluate statements way too long.
What do you mean by "way too long"? And why is that a problem?
>Although Evaluate was looking better than if (I mean readability
>here).
Readability is a good thing.
>
>Which is a better option here? Using EVALUATE (this will make the code
>long) or IF (lower readability in comparison to the EVALUATE statement)?
Probably EVALUATE, but you still haven't provided enough information to give a
good, clear answer to the question. Please post all of the following:
a) A description of what is to be accomplished by the code in question
b) The compound IF...ELSE...IF...ELSE... statement that you wrote to
accomplish it
c) The EVALUATE statement that you wrote to accomplish it.
Oh, I see Doug replied while I was busy composing this, so there is some overlap between our answers. Maybe seeing it twice in different words will be helpful.
The short answer is: Unless your program is very unusual, the difference in performance will be almost undetectable, so use the form that is most readable.
Actual measurements done of a wide range of typical business applications running on Tandem systems quite a few years ago showed that the amount of CPU time spent in user-written code was at most a couple of percent of the total CPU time used by the application. (Note, this was CPU time, NOT including any I/O wait time.) This was true for all of the applications that were measured. So choices like the one you are asking about have such a small effect on the efficiency of the program that the difference is not important. I cannot tell you exactly how small the percentage of time in user-written code was. Those experiments were done long ago, and I did not do them. I just remember reading the report of the results.
Only if your program is unusual and spends a large percentage of its CPU time actually executing instructions compiled from your source code does the efficiency of your code matter. As soon as your code calls a COBOL library function (or an operating system function), your process usually ends up executing many thousands of instructions in the library or in the operating system before it returns to your code. Your code then usually executes only a few to a few dozen instructions before calling a library function again, which executes many thousands more instructions in the library again. Suppose the IF statements in your program take 50 instructions to evaluate, while the EVALUATE statements take 150. Then the execution time ratio for the small section of the program that includes the two library function calls would be something like (5000+50)/(5000+150), or about 98%. But the rest of the program -- many, many thousands of instructions -- would be the same in both versions of the program, which would mask the 2% difference in that one small part. (I am making up those figures, so I am sure they are not exactly correct, but they do illustrate how it can be true that the amount of user-written code is insignificant.)
If your program were doing some intensive calculation -- something like solving a system of linear equations (inverting a matrix), or computing an MD5 checksum of a large file, or doing zip file compression, or doing video encoding -- then your program would be executing a much larger portion of its instructions in your user-written code and it would be sensible to give some thought to writing code that generated the fewest number of instructions. But those kinds of calculation-intensive programs are not typical business applications, so the rule of thumb does not apply to them. If your program is not a typical one, then perhaps you are correct to be concerned about the relative code efficiency between IF and EVALUATE. But my guess is that your program is not an exception.
Do be careful not to misinterpret what I am saying. I am NOT saying that there is no difference in performance between a program written such that it executes X READ statements vs. a program that executes 10*X READ statements to produce the same amount of useful results. The READ statement is a call of a runtime library function and involves executing a very large number of instructions. IF and EVALUATE statements do not involve calls of library functions. It is true that the expressions evaluated during the course of executing an IF or EVALUATE could call library functions -- for instance, to do string comparisons or arithmetic on data of type DISPLAY NUMERIC. But both versions of the program usually would involve the same or very similar expressions, so those library calls would not cause a big difference in the CPU time between the two programs. Of course, if using one form (IF or EVALUATE) causes more expressions to be evaluated than using the other form does, then it is reasonable to start thinking about whether one form performs better than the other. However, if the extra evaluations are of the same expressions, then the compiler's optimizer probably will avoid computing any given expression more than once, so the extra evaluations do not actually occur.
Now, let's move on to what you said about not being able to use AND and OR with EVALUATE. The ALSO keyword in EVALUATE is equivalent to AND, at least in the simple cases I'm thinking of. If the overall condition has the form of some number of terms connected by AND at the first level, that fits the ALSO pretty well. It is true that to do OR, you must add additional WHEN lines to the statement, but the extra code involved in implementing the extra WHEN lines is exactly the sort of extra user-written code that, as I showed above, doesn't really make enough difference overall to be worth worrying about.
I am not very experienced at using EVALUATE, so it could be that there are some kinds of tests that are not good candidates for doing with EVALUATE, but I suggest that you do not dismiss EVALUATE too quickly. I believe it is true that anything that can be expressed as a decision table is a good candidate for implementing with EVALUATE. That covers a lot of ground. If there is some set of tests that you cannot see how to implement in a reasonable way with EVALUATE, it might be worthwhile to explain here what the tests must determine and perhaps what trouble you are having in expressing it with EVALUATE. Then someone here might be able to show you a reasonable way to solve it using EVALUATE.
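To sketch the decision-table idea (all names here are invented for illustration), two conditions can be tested together with ALSO, each WHEN line acting as one row of the table and ANY serving as a wildcard:

```cobol
EVALUATE WS-RECORD-TYPE ALSO WS-FIELD-TYPE
    WHEN "A" ALSO "B"   PERFORM 100-HANDLE-AB
    WHEN "A" ALSO "C"   PERFORM 200-HANDLE-AC
    WHEN "A" ALSO ANY   PERFORM 300-HANDLE-A-OTHER
    WHEN OTHER          PERFORM 900-HANDLE-ERROR
END-EVALUATE
```

The first WHEN that matches wins, so the rows are tested top to bottom, just as you would read the decision table.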
Use the following:
EVALUATE TRUE
    WHEN THIS OR THAT PERFORM THIS-OR-THAT
    WHEN BLAH AND BLAH PERFORM BLAH-BLAH
    WHEN OTHER PERFORM PUNT
END-EVALUATE
When doing compares or moves, the SIZE of the fields makes a huge difference
in performance. In other words, comparing two 256-byte fields is
significantly slower than comparing two 1-byte fields. Ditto for MOVEs.
Sometimes, just some simple restructuring of the code helps. For example:
Given the following:
01 RECORD-TYPE PIC X(100).
01 FIELD-TYPE  PIC X(100).
IF RECORD-TYPE = "A" AND FIELD-TYPE = "B"
    ---- do something ----
ELSE
    IF RECORD-TYPE = "A" AND FIELD-TYPE = "C"
        ---- do something ----
    ELSE
        IF RECORD-TYPE = "A" AND FIELD-TYPE = "D"
            ---- do something ----
        END-IF
    END-IF
END-IF.
is bad.
Better:
IF RECORD-TYPE = "A"
    IF FIELD-TYPE = "B"
        ---- do something ----
    ELSE
        IF FIELD-TYPE = "C"
            ---- do something ----
        ELSE
            IF FIELD-TYPE = "D"
                ---- do something ----
            END-IF
        END-IF
    END-IF.
Now, if you KNOW for sure that the values in RECORD-TYPE and FIELD-TYPE
will never contain embedded spaces, then:
Best:
IF RECORD-TYPE (1:2) = "A "
IF FIELD-TYPE (1:2) = "B "
---- etc ----
Too often I have seen this:
01 COUNTERS.
    05 COUNTER OCCURS 100 TIMES PIC S9(4) COMP.
01 COUNTER-SUB PIC S9(4) COMP.

MOVE 1 TO COUNTER-SUB.
PERFORM UNTIL COUNTER-SUB > 100
    MOVE 0 TO COUNTER (COUNTER-SUB)
    ADD 1 TO COUNTER-SUB
END-PERFORM
Better:
MOVE LOW-VALUES TO COUNTERS.
Are you using the COBOL INITIALIZE verb to initialize large structures? Is
it really necessary? Better would be to initialize the area once, move the
initialized area to a hold area, and then move the hold area back to the
structure when you need to re-initialize.
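A sketch of that hold-area technique (structure names invented for illustration):

```cobol
* Pay the cost of INITIALIZE once, at program startup:
INITIALIZE WS-WORK-AREA
MOVE WS-WORK-AREA TO WS-HOLD-AREA

* Per transaction, one group MOVE restores the pristine copy
* instead of re-running INITIALIZE field by field:
MOVE WS-HOLD-AREA TO WS-WORK-AREA
```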
Are you using COMP instead of just numeric display when you should be?
Are you going through a bunch of initialization logic for each transaction
when in fact, you could get away with doing it only for the first
transaction?
There are a gazillion other examples. Now, if you have a low-transaction-volume
system, then what you do in your code will not make that much
difference, but it will make some. Take that same code and run it a billion
times a day and little things will turn into big things.
Sometimes just restructuring your SQL can also make a huge difference. The
trick here is to run your SQLCOMPs with EXPLAIN and then actually look at the
output and try to understand what the SQL engine is doing on your behalf.
Things to look for:
1) Are the indices that you specified for the table being accessed the way
you think they should be? Out of all the SQL statements in your
application, are ALL the indices that you defined being accessed?
2) Are you doing full table scans?
3) Do you have STATISTICS updated for the tables being accessed? Are they
current?
4) Are you doing sorts when you do not think you should be?
5) Are you using COUNT(*) to see if rows are present when you really do not
care how many?
6) Are you selecting / updating more columns than you need to be?
7) If you are using dynamic SQL (PREPARE, DESCRIBE, EXECUTE, ...) are you
doing the PREPARE and DESCRIBE more often than necessary?
8) If you see a bunch of messages saying that a program is being dynamically
re-sql-compiled, figure out why. Should you have similarity check enabled?
9) If you are doing a bunch of internal table searching, are you using the
optimal search technique (sequential, binary, hash, segmented, ....)?
10) If you are sharing memory between processes and need to establish
semaphores: Are you releasing those semaphores as soon as possible?
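On point 9, a sketch of the sequential vs. binary choice (table and names invented for illustration): SEARCH scans the table entry by entry, while SEARCH ALL does a binary search but requires the table to be declared with an ASCENDING KEY and kept in that order:

```cobol
01  WS-RATE-TABLE.
    05  WS-RATE-ENTRY OCCURS 500 TIMES
            ASCENDING KEY IS WS-RATE-CODE
            INDEXED BY RATE-IDX.
        10  WS-RATE-CODE   PIC X(04).
        10  WS-RATE-AMT    PIC S9(5)V99 COMP-3.

* Binary search: roughly 9 probes for 500 entries instead of
* up to 500 sequential compares with a plain SEARCH.
SEARCH ALL WS-RATE-ENTRY
    AT END PERFORM 900-RATE-NOT-FOUND
    WHEN WS-RATE-CODE (RATE-IDX) = WS-WANTED-CODE
        MOVE WS-RATE-AMT (RATE-IDX) TO WS-RATE-OUT
END-SEARCH
```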
As much as I like COBOL 88 levels, they are not very efficient. They also
generate procedural code even if you do not reference them, and in effect
you are performing some generated routine when you reference them.
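For reference, the kind of 88-level usage being discussed (names invented for illustration); the condition name reads nicely, but referencing it executes the same kind of compare you could have written out yourself:

```cobol
01  WS-STATUS-CODE          PIC X(01).
    88  STATUS-OK               VALUE "0".
    88  STATUS-ERROR            VALUE "2" THRU "9".

* The condition name...
IF STATUS-OK
    PERFORM 100-PROCESS
END-IF

* ...is shorthand for the explicit test:
IF WS-STATUS-CODE = "0"
    PERFORM 100-PROCESS
END-IF
```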
Another one of my pet peeves:
BAD, VERY VERY BAD:
01 LIT-Y PIC X(01) VALUE "Y".
MOVE LIT-Y TO SOME-VARIABLE.
I know all the hype about not having hard-coded literals in procedural code,
but the above technique is really poor. The idea behind it is that you can
change the definition of LIT-Y to some other value (say "B") and not have to
change a bunch of procedural code. HOWEVER, what we have done instead is
more or less hard-code the value into the field name. If you are going to do
something like this, then a better approach would be:
01 LIT-TRUE PIC X(01) VALUE "Y".
MOVE LIT-TRUE TO SOME-VARIABLE.
There is more to performance than speed (stuff like readability,
maintainability, time to market, testability, ...).
Just my $0.02. You can keep the change.
CJ
==
I do not doubt that your heart is in the right place -- you want people to know what matters about making their applications run well and efficiently. It just seems to me that your experiences don't match what the studies that Tandem Software Development did many years ago showed, and I would like to understand why there is that difference.
The advice you give could be helpful in some cases, but, at least according to those studies I mentioned, cases where it would be helpful were very rare in the typical customer applications that were studied. I imagine that your experience with hundreds of benchmarks over the years should mostly have been with typical applications, so I wonder why what you have seen differs so much from those studies. Could it be that you somehow only got involved with applications that were not typical? Could it be that the applications studied by software development were particularly well coded, and so were not really typical? I think they tried not to be terribly selective. Could it be that what is a typical application now is different in important ways from what was typical then?
If you limit your attention to the time spent executing the instructions compiled directly from the user-written code rather than total CPU consumption, then the sort of things you say are true -- the exact way you code things can make a big difference in the amount of CPU time consumed directly by the user-written code. But what those old studies showed was that if you look at the total CPU consumption, those kinds of differences disappear into the noise.
The reason is that such a small proportion of the total CPU consumption occurs directly in the user-written code. The typical program executes a few hundred instructions between calls to library or operating system functions, which, in turn, execute many thousands of instructions. That is why the overwhelming majority of instructions executed are not from user-written statements. Those studies done by software development showed under 2% of CPU time in typical applications was in user-written statements. So if you made some change that reduced the CPU consumption of the user code by 50%, you could, at most, have improved the total CPU consumption by less than 1%. I don't have copies of those old studies, so the "less than 2%" I am quoting is from memory. It is possible that it was a few percent higher than that, but, if so, not much higher. It was a remarkably small number.
Thinking back to the measurements you have done, were you focusing only on the CPU time spent executing instructions compiled from user-written code, or were you measuring total CPU consumption? Or was the typical program you studied one that serviced most transactions using data held entirely within its memory, not communicating with other processes? Either of those would lead you to conclusions different from what I am quoting.
When you move into discussing how to structure use of SQL statements, you are getting off into a completely different area than what I (and the original poster) are talking about. What you say there seems to be accurate (I didn't look at it closely, but you do mention the right things), but really isn't very relevant to the question of whether it is better to use IF or EVALUATE.
Some of the specific coding examples you give seem kind of odd to me, and I wonder whether I should discuss them at all. Maybe you just didn't give much thought to composing the examples.
For instance, who would use 100-byte fields as record types and field types? People naturally define fields just a couple of bytes long for such uses.
The example about avoiding repetition of the test of record type in successive IFs: I haven't looked at any generated code recently, but I would expect the compiler to lift that common subexpression to the beginning of the code block and evaluate it only once. In Tandem's early years, the COBOL compiler didn't do very much optimization, but I believe all the recent COBOL compilers do the typical optimizations that most compilers do these days. (I think it doesn't actually matter very much whether that subexpression is evaluated multiple times, but your argument is that it does matter, and I expect that one doesn't have to rewrite the code in order to avoid reevaluating it.)
As to initializing the table of counters: If that is done once at the beginning of execution of a program, it isn't going to matter at all how quickly the initialization is done. If it is something that is done once per transaction, then it might be important to make the initialization run faster, depending on what else the transaction does, but I believe it would not matter there, either, in most cases.
But setting aside those specific examples, you are correct that some differences in the way the user code is written can make a large difference in how much CPU time is spent executing the code compiled from those user-written statements. I believe, based on those old studies, that in typical Tandem applications, the amount of CPU time spent doing that is essentially irrelevant, but your experience seems to be different. Can you think of reasons why your experience is different?
=====================================================================================================================
Well, if it did not matter what the application did, then sizing a system
would be much easier than it really is. I have seen significant reductions
in both CPU time and elapsed time by carefully scrutinizing and tweaking what
the application code does, and I cited some real-world examples.