It doesn't. On fork, the page tables of the parent are copied and the
access permissions of the writable pages are changed to 'read only'.
If either parent or child tries to write to a memory location in a
shared page, this results in a page fault, and the kernel fault
handler then creates a copy of that particular page. But if you fork
anyway, you could use one of the exec routines to start the other
program, which would then replace the forked copy of the original
process. That would save one fork (system forks internally and waits
for the child to terminate). If you really want to execute a shell
command, a possibility would be
execl("/bin/sh", "sh", "-c", "<command goes here>", (char *)NULL);
Regards.
No it doesn't; copy-on-write just fakes it. Any memory not written by
the process before calling system isn't actually duplicated.
> Is there a possibility to start a program without forking and without
> blocking?
No, forking is the only way to start a process.
> I'm thinking to try pthread_create() and then system(). Is this a good
> idea?
system does execve... dunno if that's thread safe.
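For what it's worth, a rough sketch of that pthread_create() plus
system() idea (my illustration, not from the thread; note it still
forks internally, and as far as I can tell POSIX does not list
system() among the functions allowed to be non-thread-safe, but
check your platform):

#include <pthread.h>
#include <stdlib.h>

static void *run_cmd(void *arg)
{
    system((const char *)arg);     /* blocks this thread only */
    return NULL;
}

/* Hypothetical helper: run a command without blocking the caller.
   'cmd' must stay valid until the thread has finished using it. */
int spawn_cmd(const char *cmd)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, run_cmd, (void *)cmd);
    if (rc == 0)
        pthread_detach(tid);       /* we don't collect the status */
    return rc;
}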
Bye.
Jasen
[...]
>> I'm thinking to try pthread_create() and then system(). Is this a good
>> idea?
>
> system does execve... dunno if that's thread safe.
system forks and calls execve in a new process.
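Roughly, a simplified sketch of what that amounts to (my illustration;
the real system() also fiddles with SIGINT, SIGQUIT and SIGCHLD around
the wait):

#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for what system() does internally. */
int my_system(const char *cmd)
{
    pid_t pid = fork();
    int status = -1;

    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                /* execl() failed */
    }
    if (pid > 0)
        waitpid(pid, &status, 0);  /* this wait is what blocks */
    return status;
}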
> This works fine but a disadvantage is that fork() doubles the memory-
> usage.
Virtual memory is not usually considered a scarce resource.
DS
But virtual memory is a scarce resource, because there is a fixed
address space size limit and, depending on the overcommit settings of
the kernel, a fixed limit for the total amount of available virtual
memory as well. That you usually don't have to deal with a particular
situation does not mean it never affects others, and applications run
on a lot of different OS installations (eg on systems without any
paging areas, which implies that actually unused, but allocated, pages
cannot be reused for something different because their contents
cannot be stored anywhere).
'Just waste it, who cares' is at best advice that will not have a
negative impact. There is a reason why users continuously need to buy
faster machines to run the same tasks with the same sucky performance.
> > Virtual memory is not usually considered a scarce resource.
> But virtual memory is a scarce resource, because there is a fixed
> address space size limit
That's per-process, not for the system. So worrying about virtual
memory consumption of a 'fork' because of this makes no sense.
> and, depending on the overcommit settings of
> the kernel, a fixed limit for the total amount of available virtual
> memory as well.
Right, but that limit is huge because it's based on available *disk*
space, not physical memory.
> 'Just waste it, who cares' is at best advice that will not have a
> negative impact. There is a reason why users continuously need to buy
> faster machines to run the same tasks with the same sucky performance.
It is not waste, it is use. And, in my experience, the problem of
sucky performance on fast machines is largely due to *underuse* of
available resources, not overuse. (That is, software that was designed
to conserve its use of resources such as memory at some performance
cost is still conserving them even when the resource is
plentiful.)
DS
Not much.
>> and, depending on the overcommit settings of
>> the kernel, a fixed limit for the total amount of available virtual
>> memory as well.
>
> Right, but that limit is huge because it's based on available *disk*
> space, not physical memory.
It is based on configured swap space and physical RAM. Neither number
is easily available to someone developing an application intended to
be used by people other than the developer, on other computers.
>> 'Just waste it, who cares' is at best advice that will not have a
>> negative impact. There is a reason why users continuously need to buy
>> faster machines to run the same tasks with the same sucky performance.
>
> It is not waste, it is use.
To count as use, it must be actually useful to a user, and not just a
way for some developer to save his or her time. And the system where
this 'use' occurs must be able to actually reuse inactive memory, ie
do paging in the first place. Diskless embedded systems typically
can't, but can easily be as capable as 'workstations' of ten years
ago in terms of CPU power and RAM.
> And, in my experience, the problem of
> sucky performance on fast machines is largely due to *underuse* of
> available resources, not overuse. (That is, software that was designed
> to conserve its use of resources such as memory at some performance
> cost is still conserving them even when the resource is
> plentiful.)
It is not uncommon for current desktop machines to still make heavy use
of paging. But what exactly were those experiences?
> It is not uncommon for current desktop machines to still make heavy use
> of paging. But what exactly were those experiences?
Exactly as I said: software that was designed to conserve resources
performs poorly in contexts where that resource is no longer scarce.
A classic example is very small hash tables. It is, at least
in my experience, the most common cause of poor performance on beefy
hardware. The size/speed tradeoffs you make when the average machine
has 64MB are not the same as the size/speed tradeoffs you make when
the average machine has 1GB. A lot of software has not caught up.
Worrying about reasonable uses of memory just makes this problem
worse.
DS
This is a repetition of your original statement, not an explanation.
> A classic example is very small hash tables. It is, at least
> in my experience, the most common cause of poor performance on beefy
> hardware.
And that is another repetition, again referring to (still nameless)
'experiences'. It isn't even a sensible statement on its own, because
the average lookup speed of a hash table depends on its fullness and
not on its size. Additionally, even the 133Mhz processor used on the
board sitting next to me can traverse a 2^16-element linear linked list
in a time that is way too small to be reliably measurable.
> The size/speed tradeoffs you make when the average machine
> has 64MB are not the same as the size/speed tradeoffs you make when
> the average machine has 1GB. A lot of software has not caught up.
>
> Worrying about reasonable uses of memory just makes this problem
> worse.
Example please. Talk is cheap.
Example: refusal to use standard libraries for managing data --- such as the
STL and map<> --- because you think they're too big, and instead writing your
own version that's (a) buggy, (b) slower than the STL version you're
replacing, (c) insecure, and (d) wastes valuable developer time that could
instead be used on the application.
I've seen this happen many times, and indeed, have perpetrated it a few times.
Computer science has more reinvented wheels than any other technology known to
mankind --- most of them are square, too.
--
┌── dg@cowlark.com ─── http://www.cowlark.com ───────────────────
│
│ "The first 90% of the code takes the first 90% of the time. The other 10%
│ takes the other 90% of the time." --- Anonymous
> > A classic example is very small hash tables. It is, at least
> > in my experience, the most common cause of poor performance on beefy
> > hardware.
> And that is another repetition, again referring to (still nameless)
> 'experiences'. It isn't even a sensible statement on its own, because
> the average lookup speed of a hash table depends on its fullness and
> not on its size.
The average lookup speed of a hash table depends primarily on the
number of entries in each bucket. Using too few buckets because you
were expecting to run on a computer with 64MB, and then running on a
computer with 2GB, can *definitely* lead to poor performance.
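To make the bucket/chain relationship concrete, here's a minimal
chained-hash lookup (my sketch, not code from the thread; the names
and the hash function are arbitrary). The expected number of strcmp()
calls per lookup is roughly nentries / nbuckets:

#include <stddef.h>
#include <string.h>

struct node { struct node *next; const char *key; int value; };

struct table {
    struct node **buckets;
    size_t nbuckets;    /* too small a value here => long chains, slow lookups */
    size_t nentries;
};

static size_t hash_str(const char *s)
{
    size_t h = 5381;                   /* djb2-style string hash */
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

struct node *lookup(const struct table *t, const char *key)
{
    /* average chain length == t->nentries / t->nbuckets */
    struct node *n = t->buckets[hash_str(key) % t->nbuckets];
    while (n && strcmp(n->key, key) != 0)
        n = n->next;
    return n;
}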
> Additionally, even the 133Mhz processor used on the
> board sitting next to me can traverse a 2^16-element linear linked list
> in a time that is way too small to be reliably measurable.
Sure, if you only need to traverse it once. But what about when you
have to traverse it for each operation you are processing, and you are
expected to handle 160 a second?
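(Back-of-the-envelope, my numbers rather than anything from the
posts: 2^16 nodes walked 160 times a second is about 10.5 million node
visits per second, which leaves a 133Mhz CPU roughly 12-13 cycles per
node. A single cache miss while chasing pointers can cost more than
that, so the traversal alone can eat the whole CPU budget.)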
> > Worrying about reasonable uses of memory just makes this problem
> > worse.
> Example please. Talk is cheap.
I honestly don't see how one example would help if you don't grasp the
general class. This is a *huge* issue; I deal with fixing performance
problems like this on a daily basis. There's still a ton of software
out there that was written with a 133Mhz 32MB mentality that's running
on a 2.4 Ghz 1GB machine, and its performance is crap because of it.
IMO, that's a far more common problem than resource wasting. It's hard
to waste resources as badly as many programs fail to use them.
I think a lot of the blame goes to people who constantly caution
programmers about excessive resource consumption when they should be
talking about appropriate resource sizing. You keep pushing people in
one direction when they should be moving in the other direction.
Resources are getting bigger, not smaller. People need to learn to use
them; they have already had to learn to conserve them. (And performance
suffers, development time suffers, and too-clever tricks lead to
bugs.)
The computers are fast enough and have enough memory and hard drive
space now that the vast majority of problems can be and should be
solved in a fairly straightforward way.
A great example from just last week was a memory manager that had an
internal trigger to cause it to try to compress, move, and return
memory to the OS when it had a particular amount of free memory
sitting around in its pools. That amount was tuned based on the amount
of physical memory present, but unfortunately hardcoded not to exceed
1MB. Needless to say, that resulted in some wasteful thrashing.
A certain programmer, back in October of 2000, thought that 'wasting'
more than 1MB of memory was never a good thing. They were wrong. The
result was a library that gradually got worse and worse as its memory
savings grew less and less significant and its thrashing more and more
so.
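A hypothetical sketch of the kind of trigger described above (the
field names and the scaling divisor are made up; only the hardcoded
1MB cap reflects the story):

#include <stddef.h>

#define TRIM_CAP (1u << 20)            /* the hardcoded 1MB mistake */

struct pool { size_t free_bytes; size_t phys_mem; };

static size_t trim_threshold(const struct pool *p)
{
    size_t t = p->phys_mem / 256;      /* tuned to physical memory...  */
    return t > TRIM_CAP ? TRIM_CAP : t;/* ...but capped at 1MB anyway  */
}

/* On a machine with lots of RAM this fires constantly, so the pool
   keeps compressing and returning memory only to ask for it back. */
int should_trim(const struct pool *p)
{
    return p->free_bytes > trim_threshold(p);
}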
I can't even count the number of bugs that result from cleverness in
trying to save memory or CPU cycles where they don't matter one bit.
If you add to that the wasted effort trying to *understand* such
nonsense so one can clean it up, ...
DS
This is (again!) a purely theoretical conjecture and not an
example. Additionally, it doesn't have the slightest relation to
either my original posting or to David's claim about undersized hash
tables. And a scenario where the STL variant would need to be
replaced because it is a) buggy, b) slow, c) insecure due to a) and
therefore d) wastes developer time by forcing them to code around
library quirks instead of developing the application is just as
conceivable (and as useless).
> Computer science has more reinvented wheels than any other
> technology known to mankind --- most of them are square, too.
The talk about the 'reinvented wheel' is a surefire indicator of
superficial thinking: Humans invent lots of new things all the time
and very few of them are used unchanged for even a dozen years, let
alone a couple of thousand.
Collision resolution by chaining is only one way to implement hashing.
Apart from that, you are quoting me (a sensible definition of
'fullness' would be the number of entries divided by the number of
buckets, ie the load factor).
>> Additionally, even the 133Mhz processor used on the
>> board sitting next to me can traverse a 2^16-element linear linked list
>> in a time that is way too small to be reliably measurable.
>
> Sure, if you only need to traverse it once. But what about when you
> have to traverse it for each operation you are processing, and you are
> expected to handle 160 a second?
If I cannot measure something reliably because the quantity is too
small, I would expect that a quantity two orders of magnitude larger
may just become measurable, but would still have to be approximated.
>> > Worrying about reasonable uses of memory just makes this problem
>> > worse.
>
>> Example please. Talk is cheap.
>
> I honestly don't see how one example would help if you don't grasp the
> general class.
If you cannot give an example, that somehow weakens your claim, don't
you think so?
> If you cannot give an example, that somehow weakens your claim, don't
> you think so?
There is no one perfect example. There are many dozens of examples
that approach the claim from all sides.
DS
As can making the opposite mistake. Ideally, the number of buckets
should slightly exceed the number of entries; a hash table that
overflows into swap will slow you down bigtime.
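A small sketch of that sizing rule (mine, not from the thread): grow
the bucket array whenever the load factor would reach 1, so chains
stay short without over-allocating up front:

#include <stdlib.h>

struct node { struct node *next; unsigned long hash; };

struct table { struct node **buckets; size_t nbuckets; size_t nentries; };

/* Call after each insertion; doubles the bucket array when entries
   catch up with buckets, and rehashes every chain into the new array. */
int maybe_grow(struct table *t)
{
    if (t->nentries < t->nbuckets)          /* load factor still < 1 */
        return 0;

    size_t newsize = t->nbuckets * 2;
    struct node **nb = calloc(newsize, sizeof *nb);
    if (nb == NULL)
        return -1;

    for (size_t i = 0; i < t->nbuckets; i++) {
        struct node *n = t->buckets[i];
        while (n) {
            struct node *next = n->next;
            size_t b = n->hash % newsize;
            n->next = nb[b];
            nb[b] = n;
            n = next;
        }
    }
    free(t->buckets);
    t->buckets = nb;
    t->nbuckets = newsize;
    return 0;
}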
>> Additionally, even the 133Mhz processor used on the
>> board sitting next to me can traverse a 2^16-element linear linked list
>> in a time that is way too small to be reliably measurable.
>
> Sure, if you only need to traverse it once. But what about when you
> have to traverse it for each operation you are processing, and you are
> expected to handle 160 a second?
Yeah, I implemented a quicksort in QBasic (which is interpreted) and it
was (with the right dataset) faster than the MS-DOS 5 sort program
(which was compiled).
Bye.
Jasen
There are lots of things some people really believe in that others
cannot see.
I'm sorry, I've no idea what you're trying to say here. Can you clarify? Are
you arguing *for* writing your own data management routines, or *against* it?
'For' may give you better results, but frequently doesn't due to an
insufficient understanding of the problem space, and also runs the risk of
introducing new bugs.
'Against' will produce slower code, but you don't have the reliability issues
that you do with custom data management code, and you save vast amounts of
programmer time that can be put to use elsewhere.
The original context was the fairly cryptic remark that 'virtual
memory is not usually considered to be a scarce resource' and the
added claim that not wasting memory would often cause performance
degradation. I objected to this, because 'virtual memory' is finite
and thus exhaustible, and because I don't consider waste (ie
pointless abuse) of anything to be a sensible strategy.
In this context, my answer to your question would be 'neither'.
Generally, I would argue for not trying to solve the imaginary
performance problem, ie not making it faster at the expense of
additional complexity unless it is proven that a simpler
implementation is too slow. Additionally, I would avoid compiled
languages altogether if there isn't a reason to take control of fairly
low-level details like this.
Ah, right.
Do bear in mind, though, that it's frequently more efficient to waste
resources than to conserve them. The traditional way to write compilers was to
not bother freeing data, ever; given the way the programs worked ---
constructing big trees of data and then traversing them --- the extra overhead
of keeping track of memory vastly outweighed the savings in freeing unused
memory. Given that the OS would clean up after you, it simply wasn't worth the
bother.
[...]
> ...avoid compiled
> languages altogether if there isn't a reason to take control of fairly
> low-level details like this.
My thoughts exactly. I tend to write stuff in Lua these days; it's a very
small, very fast interpreted language along the same lines as Javascript and
Python. It's got a liberal license, it's small enough that I can understand
the entire language, it's easily extensible in C or C++, and in general it's
good enough to handle anything I can throw at it. I've just written a
bouncing-around-in-space game for PDAs where all the game logic, collision
detection, main loop, redraw code, etc. was in Lua, and the only stuff in C++
was the actual low-level drawing code. The reduced complexity meant that I
could write the entire game in three days. That's hard to beat.
--
┌── dg@cowlark.com ─── http://www.cowlark.com ───────────────────
│ "Wizards get cranky, / Dark days dawn, / Riders smell manky, / The road
│ goes on. / Omens are lowering, / Elves go West; / The Shire needs
│ scouring, / You may as well quest." - John M. Ford