Why is it not possible to have code (except declarations and MACROs) at
file scope outside the body of any function?
This code could initialize tables used by the module when it first gets
loaded. Like a BEGIN block in Perl.
Thanks.
Because the language grammar doesn't allow for it.
> This code could initialize tables used by the module when it first gets
> loaded. Like a BEGIN block in Perl.
Doesn't make sense, really. Modules aren't "loaded" in C the way they are
in an interpreted language. There isn't a specific point at which a given
".c" file is "loaded" -- all of them have already been compiled, in arbitrary
and not necessarily consistent order, and then the output of compilation
linked together. Basically, if you want initializers, write them. There
is some deep magic available to do this in a semi-automated way for some
compilers.
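
For what it's worth, one form of that compiler-specific magic is the
"constructor" function attribute, a GCC extension that Clang also accepts.
It is not standard C. A minimal sketch, where squares, init_squares and
square_of are names made up for illustration:

```c
/* Assumption: GCC or Clang. __attribute__((constructor)) is an extension,
 * not part of standard C. All names here are hypothetical. */

#define TABLE_SIZE 16

static int squares[TABLE_SIZE];

/* The toolchain arranges for this to run before main() is entered. */
__attribute__((constructor))
static void init_squares(void)
{
    for (int i = 0; i < TABLE_SIZE; i++)
        squares[i] = i * i;
}

/* No init check needed here: the table is filled before main() starts. */
int square_of(int n)
{
    return squares[n];
}
```

The order in which constructors from different translation units run is
not something you should count on, so this is safest for tables used only
inside the file that builds them.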
-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I don't agree. This is a very useful feature - otherwise either all your
initializations have to be constants (but often you want to read from the
registry for example) or you need to have a special ModuleInit function and
hope all the clients remember to invoke it.....
The only thing needed: there is a moment when for the first time a
function in that module gets called. Before that happens, the BEGIN block
is guaranteed to have run. If no function is called, the BEGIN block may
get run or may not, no guarantees.
I think that is a simple but useful feature.
>> Why is it not possible to have code (except declarations and MACROs) at
>> file scope outside the body of any function?
>Because the language grammar doesn't allow for it.
Well duh.
>> This code could initialize tables used by the module when it first gets
>> loaded. Like a BEGIN block in Perl.
>Doesn't make sense, really. Modules aren't "loaded" in C the way they are
>in an interpreted language.
You're being too literal. It could perfectly well be executed at program
start-up. There are certainly ordering issues, but there are plenty
of cases where they do not arise, for example building a table that is
only used in the same translation unit.
>Basically, if you want initializers, write them.
It's not writing them that's the problem, it's calling them.
>There
>is some deep magic available to do this in a semi-automated way for some
>compilers.
And this of course could be made a standard part of C.
Presumably the OP wants to know why it wasn't. I suspect the reason
is just historical: C was intended as a small, simple language, and
no-one wrote linkers for it that provided the necessary support.
-- Richard
--
Please remember to mention me / in tapes you leave behind.
> I don't agree.
You don't get a vote in whether or not to agree; I was describing how the
language works. Now, you might wish it worked differently, but I wasn't
answering the question "should this work differently", but rather, "why
does the way it works now not allow code outside of functions".
> This is a very useful feature - otherwise either all your
> initializations have to be constants (but often you want to read from the
> registry for example) or you need to have a special ModuleInit function and
> hope all the clients remember to invoke it.....
If you need to do that, you do it yourself, which is pretty easy.
> The only thing needed: there is a moment when for the first time a
> function in that module gets called. Before that happens, the BEGIN block
> is guaranteed to have run. If no function is called, the BEGIN block may
> get run or may not, no guarantees.
> I think that is a simple but useful feature.
It's a horrible misfeature for C.
Let's walk through why.
Let's say you wanted to do this. It's easy for you to do it yourself; you
start every function with
my_init();
which does initialization if it needs to.
Now... What happens to the performance of all of your functions when you add
that to them? Why, it gets worse. For every function. Meaning that you are
silently adding a significant cost to every function in the module.
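
To make the pattern concrete, here is a minimal sketch of the guarded
initializer. Apart from my_init, the names (cube_table, cube_of, cube_sum,
my_init_runs) are invented for illustration, and the flag test is not
thread-safe as written:

```c
#include <stdbool.h>

#define TABLE_SIZE 10

static int cube_table[TABLE_SIZE];
static bool initialized;   /* false until my_init() has done its work */
static int init_runs;      /* counts real initializations, for demonstration */

/* Does the initialization only if it has not been done yet. */
static void my_init(void)
{
    if (initialized)
        return;
    for (int i = 0; i < TABLE_SIZE; i++)
        cube_table[i] = i * i * i;
    initialized = true;
    init_runs++;
}

/* Every public function starts with the same call and pays for the
 * flag test on every invocation. */
int cube_of(int n)
{
    my_init();
    return cube_table[n];
}

int cube_sum(int upto)
{
    my_init();
    int sum = 0;
    for (int i = 0; i <= upto; i++)
        sum += cube_table[i];
    return sum;
}

/* Reports how many times the table was actually built. */
int my_init_runs(void)
{
    return init_runs;
}
```

The table is built once no matter how many calls are made, but the call
and the branch at the top of each function are paid every time.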
That's not how C generally does things. If you want to call an initializer
at the beginning of every function in a module, you are welcome to do so, but
C does not favor "solutions" in which everyone in the entire world has to pay
a potentially significant cost for every program they ever build just because
you wanted this feature once.
In languages like perl or ruby, a feature like this makes sense, because each
module of code is "loaded" at a specific time. In C, you can do this by
writing your own little initializers all over, or you can do things that are
more clever but less portable.
If you want to see a reasonable example of how this might be solved in a real
system in C, in a way that is much cleaner and better defined than the vague
"let's just run code outside of functions", look at how the Linux kernel
handles initialization functions.
It's clearer, it's better-designed, and it has a number of other advantages
(such as allowing the kernel to dump the executable code for those functions
once it's done running them).
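
The following is only a loose sketch in the spirit of that mechanism, not
the kernel's actual code. It assumes GCC or Clang with GNU ld (or a
compatible linker) on an ELF target, where the linker provides
__start_/__stop_ symbols for any section whose name is a valid C
identifier. REGISTER_INIT, run_initcalls and the section name "initcalls"
are hypothetical names:

```c
/* Assumptions: GCC/Clang attributes and a GNU-ld-compatible linker on ELF.
 * Not portable C. All names are made up for this sketch. */

typedef void (*initcall_t)(void);

/* Store a pointer to fn in a dedicated section called "initcalls". */
#define REGISTER_INIT(fn) \
    static initcall_t initcall_##fn \
        __attribute__((used, section("initcalls"))) = fn

/* Provided by the linker: the bounds of the "initcalls" section. */
extern initcall_t __start_initcalls[];
extern initcall_t __stop_initcalls[];

static int squares[8];
static int tables_built;

/* An ordinary function; the module registers it itself. */
static void build_squares(void)
{
    for (int i = 0; i < 8; i++)
        squares[i] = i * i;
    tables_built++;
}
REGISTER_INIT(build_squares);

/* Called once at start-up; walks every registered initializer without
 * knowing which modules contributed them. */
void run_initcalls(void)
{
    for (initcall_t *p = __start_initcalls; p < __stop_initcalls; p++)
        (*p)();
}

int square_lookup(int n)
{
    return squares[n];
}

int tables_built_count(void)
{
    return tables_built;
}
```

Any translation unit can add its own REGISTER_INIT line, and the start-up
code only ever needs the single run_initcalls() call.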
> You're being too literal. It could perfectly well be executed at program
> start-up. There are certainly ordering issues, but there are plenty
> of cases where they do not arise, for example building a table that is
> only used in the same translation unit.
That still doesn't answer the question of when that translation unit is
"loaded". In perl or ruby, it'd be completely unambiguous; in C, it's not,
because C execution isn't defined in terms of the parser.
>>Basically, if you want initializers, write them.
> It's not writing them that's the problem, it's calling them.
That's easily done.
>>There
>>is some deep magic available to do this in a semi-automated way for some
>>compilers.
> And this of course could be made a standard part of C.
It could, in fact.
> Presumably the OP wants to know why it wasn't.
I don't think so. The OP seems totally uninterested in "an initialization
function which is guaranteed to be called before the execution of other code",
but rather, in "code outside of any function". At least, that was the
impression I got.
> I suspect the reason
> is just historical: C was intended as a small, simple language, and
> no-one wrote linkers for it that provided the necessary support.
Pretty much. It turns out that there are not that many circumstances where
it's really necessary.
>>>Basically, if you want initializers, write them.
>> It's not writing them that's the problem, it's calling them.
>That's easily done.
An error I have made several times, and found several times in other
programs, is the failure to call a library's initialisation function.
Sometimes the mistake is rather hard to find, because many functions
may work perfectly well without the initialisation, and the error only
appears when another function is used.
>I don't think so. The OP seems totally uninterested in "an initialization
>function which is guaranteed to be called before the execution of other code",
>but rather, in "code outside of any function". At least, that was the
>impression I got.
I don't see much beyond a syntactic difference. The code outside
any function could be placed by the compiler in a function (or
possibly several functions) which is then called before other
functions in the file. Obviously there are a few tedious details
to work out.
C++ added the ability to run code before main to support initialisation
of static objects (calling their constructors). But being constrained by
linker technology the order of execution of code in different
compilation units is unspecified. So the feature is of use, but it is
far from perfect and we have to work around the ordering issues.
> Basically, if you want initializers, write them. There
> is some deep magic available to do this in a semi-automated way for some
> compilers.
The age old problem of failing to call them still catches out the
unwary, but doing so is probably easier to spot than the unpleasant
ordering issues that C++ has.
--
Ian Collins
But you can make all the functions in the library call the initialization
function conditionally easily enough.
> I don't see much beyond a syntactic difference. The code outside
> any function could be placed by the compiler in a function (or
> possibly several functions) which is then called before other
> functions in the file. Obviously there are a few tedious details
> to work out.
Not necessarily. In ruby, for instance, the placement of the code outside
the function definitions can matter. I think it may also in perl, though
I'm less sure.
But it does seem like at some level it's primarily a question of syntactic
sugar, but C's design makes the syntactic sugar extremely hard to guess
at.
>> The only thing needed: there is a moment when for the first time a
>> function in that module gets called. Before that happens, the BEGIN block
>> is guaranteed to have run. If no function is called, the BEGIN block may
>> get run or may not, no guarantees.
>
>> I think that is a simple but useful feature.
>
> It's a horrible misfeature for C.
>
> Let's walk through why.
>
> Let's say you wanted to do this. It's easy for you to do it yourself; you
> start every function with
> my_init();
> which does initialization if it needs to.
>
> Now... What happens to the performance of all of your functions when you add
> that to them? Why, it gets worse. For every function. Meaning that you are
> silently adding a significant cost to every function in the module.
That's not what was suggested, which was a single, one-time initialisation
function for the module.
--
Bartc
Unless you are doing some serious crazy linker magic, the only way you can
achieve that is to have every function in the module check whether the
initializer has already run and, if not, call it. Every time the function
is called.
There is crazy linker magic to let you bypass this, but it ain't pretty.
This feature appears in a large and feature-laden offshoot dialect of C
known as C++.
C++ external definitions can have initializers which are not constant
expressions, and can call functions. C++ programs can also have external
definitions for objects whose constructors must be called.
All of that happens before the execution of main.
C++ does not solve the module ordering problem. If such initializations appear
in different translation units of the program, you don't know which ones happen
first. This ambiguity can be a source of bugs in C++ programs.
If you really need this feature, just use C++. Your code will be less portable
than if it were written in the 1990 dialect of C, but C++ is quite popular and
compiler availability is good.
You also said it doesn't make sense. That's a kind of vote. If you get a vote,
then sandeep gets a vote too.
Look at the original post: what was needed was an initialisation routine for
the module. That's called once as the module gets 'loaded'. That can do
runtime initialisations of file scope variables, before any of the functions
in the module are called.
But there is no explicit loading in C (unless the module is made into a
dynamic library). The same thing can be achieved by providing such an
init-function for the module, and arranging to have it called when the
entire application starts.
--
Bartc
I got the impression that the big focus wasn't "an initialization routine"
but "the ability to just put code in outside of any functions" -- that the
syntactic sugar of having this implicitly happen to code that wasn't in a
function was more significant to the request, and an initialization routine,
even one which magically "happened first", would not be an acceptable
substitute.
> But there is no explicit loading in C (unless the module is made into a
> dynamic library). The same thing can be achieved by providing such an
> init-function for the module, and arranging to have it called when the
> entire application starts.
Yes. The tricky part is how you do that; in particular, how do you make it
so the module itself can arrange to have the function called, without code
elsewhere having to know that you need that done?
The answer is some pretty hairy linker magic, and it ain't portable.
>> But there is no explicit loading in C (unless the module is made into a
>> dynamic library). The same thing can be achieved by providing such an
>> init-function for the module, and arranging to have it called when the
>> entire application starts.
>
> Yes. The tricky part is how you do that; in particular, how do you make it
> so the module itself can arrange to have the function called, without code
> elsewhere having to know that you need that done?
>
> The answer is some pretty hairy linker magic, and it ain't portable.
To do it automatically, perhaps.
But if someone is writing such a module to be statically linked to an
application, then this will just be part of the API: call such and such
init-function before doing anything else.
(Perhaps call it load_modulename() to make it appear it's doing something
crucial. After all, those other languages mentioned probably also have to
explicitly import or load a module before use.)
(The init-function will likely have logic to allow it to be called more than
once but only actually do the init once; and, if it depends on other similar
modules, to call the init-functions in those as well.)
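
A minimal sketch of that arrangement, assuming two made-up modules, a
"logger" and a "parser" that depends on it. The load_* names and the
counters are purely illustrative, and the flags are not thread-safe:

```c
#include <stdbool.h>

/* ---- hypothetical "logger" module ---- */
static bool logger_loaded;
static int logger_inits;    /* counts real initializations */

void load_logger(void)
{
    if (logger_loaded)      /* safe to call any number of times */
        return;
    logger_loaded = true;
    logger_inits++;
}

int logger_init_runs(void)
{
    return logger_inits;
}

/* ---- hypothetical "parser" module, which depends on the logger ---- */
static bool parser_loaded;
static int parser_inits;

void load_parser(void)
{
    if (parser_loaded)
        return;
    load_logger();          /* bring up the dependency first */
    parser_loaded = true;
    parser_inits++;
}

int parser_init_runs(void)
{
    return parser_inits;
}
```

The application calls load_parser() once near the top of main and does not
need to know about the logger at all; a stray second call to either
function is harmless.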
--
Bartc
The use of init (and finish) functions is common in drivers, where the
"application" truly is a module. The init function gets called when the
driver is loaded.
--
Ian Collins