As anybody knows here, one needs to be careful with GFP_xxx
flags. I've no doubts it's important to get it right in fs
and elsewhere in critical places or else nasty things will happen.
However, for driver code it seems like questionaire
"do you remember which network callback is atomic?".
It struck me that kernel actually can figure out whether it's okay
to sleep or not by looking at combination of (flags & __GFP_WAIT)
and ((in_atomic() || irqs_disabled()) as it already does this for
might_sleep() barfing:
kmalloc => __cache_alloc =>
static inline void
cache_alloc_debugcheck_before(kmem_cache_t *cachep, unsigned int __nocast flags)
{
might_sleep_if(flags & __GFP_WAIT);
#if DEBUG
kmem_flagcheck(cachep, flags);
#endif
}
and
void __might_sleep(char *file, int line)
{
#if defined(in_atomic)
static unsigned long prev_jiffy; /* ratelimiting */
if ((in_atomic() || irqs_disabled()) &&
system_state == SYSTEM_RUNNING && !oops_in_progress) {
if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
return;
prev_jiffy = jiffies;
printk(KERN_ERR "Debug: sleeping function called from invalid"
" context at %s:%d\n", file, line);
printk("in_atomic():%d, irqs_disabled():%d\n",
in_atomic(), irqs_disabled());
dump_stack();
}
#endif
}
So why can't we have kmalloc_auto(size) which does GFP_KERNEL alloc
if called from non-atomic context and GFP_ATOMIC one otherwise?
--
vda
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
that is not enough.
you could be holding a spinlock for example!
(and no that doesn't set in_atomic() always)
Because it's a lot better in generel if we force people to think about
what they are doing wrt memory allocations. You should know if you are
able to block or not, a lot of functions exported require you to have
this knowledge anyways. Adding these auto-detection type functions
encourages bad programming, imho.
--
Jens Axboe
but it sets irqs_disabled() IIRC.
--
vda
Sure, you can't stop people from doing bad programming. But I don't
think we should aid them along the way.
Those 'bad programming' people can simply use GFP_ATOMIC always, no?
This would be even worse because kmalloc_auto() will sleep
if it's allowed, but GFP_ATOMIC would not.
--
vda
only spin_lock_irq() and co do.
not the simple spin_lock()
On Wed, 29 Jun 2005, Arjan van de Ven wrote:
> >
> > but it sets irqs_disabled() IIRC.
>
> only spin_lock_irq() and co do.
> not the simple spin_lock()
>
It may be dangerous to use spin_lock with interrupts enabled, since you
have to make sure that no interrupt ever grabs that lock. Although I do
recall seeing a few locks like this. But even so, you can transfer the
latency of the interrupts going off while holding that lock to another CPU
which IMHO is a bad thing. Also a simple spin_lock would disable
preemption with CONFIG_PREEMPT set and that would make in_atomic fail.
But to implement a kmalloc_auto you would always need to have a preempt
count.
I'm not for a kmalloc_auto, but something like it would be useful for a
function that can work for either context, and just fail nicely if the
ATOMIC is set and the malloc can't get memory. A function like this would
currently have to always use ATOMIC even if it could have used KERNEL for
some scenarios, since it would suffer the same pitfalls as a kmalloc_auto
in determining its context.
-- Steve
This is why I always use _irqsave. Less error prone.
And locking is a very easy to get 'slightly' wrong, thus
I trade 0.1% of performance for code simplicity.
> preemption with CONFIG_PREEMPT set and that would make in_atomic fail.
> But to implement a kmalloc_auto you would always need to have a preempt
> count.
>
> I'm not for a kmalloc_auto, but something like it would be useful for a
> function that can work for either context, and just fail nicely if the
> ATOMIC is set and the malloc can't get memory. A function like this would
> currently have to always use ATOMIC even if it could have used KERNEL for
> some scenarios, since it would suffer the same pitfalls as a kmalloc_auto
> in determining its context.
This is more or less what I meant. Why think about each kmalloc and when you
eventually did get it right: "Aha, we _sometimes_ get called from spinlocked code,
GFP_ATOMIC then" - you still do atomic alloc even if cases when you
were _not_ called from locked code! Thus you needed to think longer and got
code which is worse.
--
vda
But sometimes you get lucky and trade 100ms latency for code
simplicity. Of course, the audio people don't mind anymore, now that
we have all sorts of realtime patches. Everyone's happy!
Jörn
--
He that composes himself is wiser than he that composes a book.
-- B. Franklin
On Wed, 29 Jun 2005, Jörn Engel wrote:
> On Wed, 29 June 2005 17:14:32 +0300, Denis Vlasenko wrote:
> >
> > This is why I always use _irqsave. Less error prone.
> > And locking is a very easy to get 'slightly' wrong, thus
> > I trade 0.1% of performance for code simplicity.
>
> But sometimes you get lucky and trade 100ms latency for code
> simplicity. Of course, the audio people don't mind anymore, now that
> we have all sorts of realtime patches. Everyone's happy!
>
God! If you are holding a spin_lock for 100ms, something is terribly
wrong, especialy since you better not schedule holding that spin_lock.
Spinlocks are _suppose_ to be for quick things. The difference in latency
between a *_lock and *_lock_irqsave only effects UP, on SMP both will give
the same latency, since another CPU might be busy spinning while waiting
for that lock, heck, on SMP the latency of *_lock can actually be higher,
since, as I already said, the other CPU will even have to wait while the
CPU that has the lock is servicing interrupts.
Although I must say that with all the realtime patches I'm happy :-)
-- Steve
And if not? GFP_NOFS and GFP_NOIO exist for a reason.
Regards
Oliver
All nice and well. But still, for the sake of simplicity and me not
wanting to think, I prefer always using spin_lock_irqsave for
everything. Since when should I stop and think about my own code?
In fact, why don't we all sit down and start using KCSP for kernel
hacking? ;)
Jörn
--
I can say that I spend most of my time fixing bugs even if I have lots
of new features to implement in mind, but I give bugs more priority.
-- Andrea Arcangeli, 2000
Ok, before even more people get confused - I was joking. Simple code
is obviously a good thing to have. Not thinking about code, well...
> > In fact, why don't we all sit down and start using KCSP for kernel
> > hacking? ;)
>
> Naw, I'm doing my PhD on implemting Linux drivers in SmallTalk. That will
> make everybody happy!
... but it appears as if you got the joke.
Jörn
--
Public Domain - Free as in Beer
General Public - Free as in Speech
BSD License - Free as in Enterprise
Shared Source - Free as in "Work will make you..."
On Wed, 29 Jun 2005, Jörn Engel wrote:
>
> All nice and well. But still, for the sake of simplicity and me not
> wanting to think, I prefer always using spin_lock_irqsave for
> everything. Since when should I stop and think about my own code?
OK, I use spin_lock_irqsave first, and then I only use spin_lock when I
already know interrupts are off. But the locks I usually use are used by
interrupts and that is reason enough to use it. I wouldn't use the
_irqsave for simplicity, I use it since I still believe it keeps latency
down for SMP.
>
> In fact, why don't we all sit down and start using KCSP for kernel
> hacking? ;)
>
Naw, I'm doing my PhD on implemting Linux drivers in SmallTalk. That will
make everybody happy!
-- Steve
> Ok, before even more people get confused - I was joking. Simple code
> is obviously a good thing to have. Not thinking about code, well...
Your first part seemed half joking and half serious, so I responded with a
more serious answer, just to make sure that others don't think I write
code like this:
spin_lock_irqsave(lock1,flags1);
spin_lock_irqsave(lock2,flags2);
spin_lock_irqsave(lock3,flags3);
Which would just be really silly! But you never know ;-)
>
> ... but it appears as if you got the joke.
>
What do you mean? My professor has shown me the light and LinuxSmallTalk
is the future.
-- Steve
> This is why I always use _irqsave. Less error prone.
No, it's just bad programming. How hard can it be to see which spinlocks are being used
by your ISR and which ones aren't? Only the ones that your ISR touches should have
_irqsave. It's really quite simple.
> This is more or less what I meant. Why think about each kmalloc and when you
> eventually did get it right: "Aha, we _sometimes_ get called from spinlocked code,
> GFP_ATOMIC then" - you still do atomic alloc even if cases when you
> were _not_ called from locked code! Thus you needed to think longer and got
> code which is worse.
So you're saying that you're the kind of programmer who makes more mistakes the longer you
think about something?????
Using GFP_ATOMIC increases the probability that you won't be able to allocate the memory
you need, and it also increases the probability that some other module that really needs
GFP_ATOMIC will also be unable to allocate the memory it needs. Please tell me, how is
this considered good programming?
--
Timur Tabi
Staff Software Engineer
timur...@ammasso.com
One thing a Southern boy will never say is,
"I don't think duct tape will fix it."
-- Ed Smylie, NASA engineer for Apollo 13
>It struck me that kernel actually can figure out whether it's okay
>to sleep or not by looking at combination of (flags & __GFP_WAIT)
>and ((in_atomic() || irqs_disabled()) as it already does this for
>might_sleep() barfing:
>
Wrong:
- the kernel cannot figure out if a thread owns a normal spinlock():
in_atomic detects spin_lock_bh(), irqs_disabled spin_lock_irq(). But
spin_lock has no global state.
- dito for get_cpu()/put_cpu users.
- dito for rcu users, or anyone else that uses preempt_disable() for
whatever purpose
--
Manfred
One question from Linux-Tag was about the lack of documentation about/in
the kernel. I try to maintain docbook entries when I modify code, even
though I think it's mostly wasted time: Virtually noone reads it anyway,
instead armchair logic on lkml.
Steven wrote:
>Here we see that task 2 can spin with interrupts off, while the first task
>is servicing an interrupt, and God forbid if the IRQ handler sends some
>kind of SMP signal to the CPU running task 2 since that would be a
>deadlock. Granted, this is a hypothetical situation, but makes using
>spin_lock with interrupts enabled a little scary.
>
>
Not, it's not even a hypothetical situation. It's an explicitely
forbidden situation: SMP signals are sent with smp_call_function and the
documentation to that function clearly says:
*
* You must not call this function with disabled interrupts or from a
* hardware interrupt handler or from a bottom half handler.
*/
>
> On Wed, 29 Jun 2005, Timur Tabi wrote:
>
>> Denis Vlasenko wrote:
>>
>>> This is why I always use _irqsave. Less error prone.
>>
>> No, it's just bad programming. How hard can it be to see which spinlocks are being used
>> by your ISR and which ones aren't? Only the ones that your ISR touches should have
>> _irqsave. It's really quite simple.
>
> What about my argument that spin_lock is actually a longer latency on an
> SMP system? That is, if you grab a spin_lock and task on another CPU
> tries to grab it and starts to spin. It will spin while the first task is
> servicing interrupts. It can be even worst with the following scenario:
>
> task 1:
> spin_lock(&non_irq_lock);
>
> task 2:
>
> spin_lock_irqsave(&some_irq_used_lock);
> spin_lock(&non_irq_lock);
>
> Here we see that task 2 can spin with interrupts off, while the first task
> is servicing an interrupt, and God forbid if the IRQ handler sends some
> kind of SMP signal to the CPU running task 2 since that would be a
> deadlock. Granted, this is a hypothetical situation, but makes using
> spin_lock with interrupts enabled a little scary.
>
But it wouldn't deadlock! It would just spin until the guy on
another CPU that had the lock unlocked.
FYI, spin_lock() is supposed to be used in an interrupt where it
is already known that the interrupts are OFF so you don't need
to save/restore flags because you know the condition. IFF the
ISR were to enable interrupts, with a spin-lock held (bad practice),
it still wouldn't deadlock, it's just that the entire system could
potentially degenerate into a poll-mode, spin in the ISR, mode
with awful performance.
>
>>> This is more or less what I meant. Why think about each kmalloc and when you
>>> eventually did get it right: "Aha, we _sometimes_ get called from spinlocked code,
>>> GFP_ATOMIC then" - you still do atomic alloc even if cases when you
>>> were _not_ called from locked code! Thus you needed to think longer and got
>>> code which is worse.
>>
>> So you're saying that you're the kind of programmer who makes more mistakes the longer you
>> think about something?????
>>
>> Using GFP_ATOMIC increases the probability that you won't be able to allocate the memory
>> you need, and it also increases the probability that some other module that really needs
>> GFP_ATOMIC will also be unable to allocate the memory it needs. Please tell me, how is
>> this considered good programming?
>
> I believe he was trying to say that there might be a function that is
> called by both an interrupt and non interrupt (schedulable) code. That
> means that the code would always need to do a GFP_ATOMIC and yes, it would
> mean that there's a higher chance that it would fail. So if you have some
> function that's used by lots of schedulable code and that same function is
> seldom used by an interrupt, then you either need to maintain two versions
> of the function (one with GFP_ATOMIC and one without) or always use
> GFP_ATOMIC which would mean the the majority user would suffer
> unsuccessful allocations more often.
>
> -- Steve
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majo...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
Cheers,
Dick Johnson
Penguin : Linux version 2.6.12 on an i686 machine (5537.79 BogoMips).
Notice : All mail here is now cached for review by Dictator Bush.
98.36% of all statistics are fiction.
> >
> > task 1:
> > spin_lock(&non_irq_lock);
> >
> > task 2:
> >
> > spin_lock_irqsave(&some_irq_used_lock);
> > spin_lock(&non_irq_lock);
> >
> > Here we see that task 2 can spin with interrupts off, while the first task
> > is servicing an interrupt, and God forbid if the IRQ handler sends some
> > kind of SMP signal to the CPU running task 2 since that would be a
> > deadlock. Granted, this is a hypothetical situation, but makes using
> > spin_lock with interrupts enabled a little scary.
> >
>
> But it wouldn't deadlock! It would just spin until the guy on
> another CPU that had the lock unlocked.
>
Since Timur specified that spin_lock is used when you know that the lock
is not used in an interrupt, I'll continue as if that was the case.
This is a deadlock, because sending most SMP signals expect a reply when
it is received, and will wait until it gets it. Here's a more detailed
version.
CPU 1: running task 1
spin_lock(&non_irq_lock);
on CPU 2: running task 2
spin_lock_irqsave(&some_irq_used_lock); /* interrupts now off */
spin_lock(&non_irq_lock); /* blocked and spinning waiting for task 1*/
back on CPU 1:
interrupt goes off a preempts task 1 before it releases the
non_irq_lock.
Calls some stupid ISR that needs to send a signal to the other CPU.
Sends signal and waits for a reply. (for example the flush_tlb_others)
Now we have a deadlock. The interrupt on CPU 1 is waiting for a response
after sending an IPI to CPU 2, but CPU 2 is stuck spinning with interrupts
disabled and wont ever respond because the lock it is waiting for is held
by task 1 on CPU 1 that was preempted by the interrupt that will never
return.
This is probably the reason it is not allowed to call most IPIs from
interrupt or bottom half context.
> FYI, spin_lock() is supposed to be used in an interrupt where it
> is already known that the interrupts are OFF so you don't need
> to save/restore flags because you know the condition. IFF the
> ISR were to enable interrupts, with a spin-lock held (bad practice),
> it still wouldn't deadlock, it's just that the entire system could
> potentially degenerate into a poll-mode, spin in the ISR, mode
> with awful performance.
I stated earlier that the only times I use spin_lock is when I already
know that interrupts are off, and that includes ISRs.
There are cases where using spin_lock instead of _irqsave version is a
matter of correctness. For example, the page table lock beeing always
taking without _irq is important to let the IPIs flow.
Ben.
On Wed, 29 Jun 2005, Manfred Spraul wrote:
> Hi,
>
> One question from Linux-Tag was about the lack of documentation about/in
> the kernel. I try to maintain docbook entries when I modify code, even
> though I think it's mostly wasted time: Virtually noone reads it anyway,
> instead armchair logic on lkml.
Hmm, I do like to read the comments, see below.
>
> Steven wrote:
>
> >Here we see that task 2 can spin with interrupts off, while the first task
> >is servicing an interrupt, and God forbid if the IRQ handler sends some
> >kind of SMP signal to the CPU running task 2 since that would be a
> >deadlock. Granted, this is a hypothetical situation, but makes using
> >spin_lock with interrupts enabled a little scary.
> >
> >
> Not, it's not even a hypothetical situation. It's an explicitely
> forbidden situation: SMP signals are sent with smp_call_function and the
> documentation to that function clearly says:
When I said _hypothetical_ I ment it. That's basically stating that the
situation wont happen, but lets pretend that it will. And no, SMP signals
(on intel anyway) are sent with send_IPI_* which even smp_call_function
uses.
> *
> * You must not call this function with disabled interrupts or from a
> * hardware interrupt handler or from a bottom half handler.
> */
>
And if you had read my other emails you would have noticed that I
even mentioned this particular comment. When I said:
"This is probably the reason it is not allowed to call most IPIs from
interrupt or bottom half context."
Also, a comment doesn't force this, and there's no test in
smp_call_function that prevents a user from calling this form a
bottom_half!
-- Steve
On Thu, 30 Jun 2005, Benjamin Herrenschmidt wrote:
>
> There are cases where using spin_lock instead of _irqsave version is a
> matter of correctness. For example, the page table lock beeing always
> taking without _irq is important to let the IPIs flow.
>
There's always exceptions! ;-)
As Jörn mentioned, you don't just use spin_lock_irqsave just to keep from
thinking, which I totally agree, but most of the time I use spin_locks, it
is better to not let interrupts flow, and for this reason, I try to keep
the places that use spin_locks as short as possible, since I don't want
100ms latencies.
-- Steve
Given that I do not touch core kernel and most of spinlocks I ever
did were in drivers - yes, they are there to protect me from IRQ
handler races.
> > This is more or less what I meant. Why think about each kmalloc and when you
> > eventually did get it right: "Aha, we _sometimes_ get called from spinlocked code,
> > GFP_ATOMIC then" - you still do atomic alloc even if cases when you
> > were _not_ called from locked code! Thus you needed to think longer and got
> > code which is worse.
>
> So you're saying that you're the kind of programmer who makes more mistakes the longer you
> think about something?????
No.
I say that writing kmalloc(size, GFP_ATOMIC) takes more time to verify
that it is needed compared to hypothetical kmalloc_auto(size),
and yet kmalloc(size, GFP_ATOMIC) is worse in a sense thet it won't sleep
even if it happens to be called outside locks.
Think about this:
/* may be called under spinlock */
void do_something() {
/* we need to alloc here */
}
> Using GFP_ATOMIC increases the probability that you won't be able to allocate the memory
> you need, and it also increases the probability that some other module that really needs
> GFP_ATOMIC will also be unable to allocate the memory it needs. Please tell me, how is
> this considered good programming?
Where did I say "let's use GFP_ATOMIC everywhere" ?
--
vda
On Thu, 30 Jun 2005, Denis Vlasenko wrote:
>
> Think about this:
>
> /* may be called under spinlock */
> void do_something() {
> /* we need to alloc here */
> }
>
Well thinking about this :-) The way that this needs to be implemented is
by passing to do_something the flags to use in kmalloc, which is done in
the kernel when needing to call something that allocates when different
flags can be used. So a kmalloc_auto would only save on passing flags in
parameters, which can sometimes get annoying.
-- Steve