[PATCH] fanotify: to differ file access event from different threads

boyd yang

Oct 10, 2011, 2:50:01 AM

This patch fixes a hang in Eric Paris's fs notification/fanotify code.

Fanotify provides a way to intercept file access events.
When multiple threads iterate over the same directory, some of the threads hang.
This patch makes fanotify distinguish access events from different
threads, which prevents it from merging access events from different
threads.
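
For readers following the thread, the change to the merge check can be modeled in user space. This is only a sketch under stated assumptions: `struct model_event` and both function names are hypothetical, integer ids stand in for the kernel's `struct pid *` pointers, and the path comparison that the real check also performs is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct fsnotify_event. Only the fields that
 * the merge check compares are modeled; ints replace struct pid *. */
struct model_event {
	const void *to_tell;	/* object the event is reported against */
	int data_type;		/* e.g. FSNOTIFY_EVENT_PATH in the kernel */
	int tgid;		/* thread group (process) id */
	int pid;		/* id of the individual thread */
};

/* Check before the patch: all threads of one process compare equal. */
bool should_merge_by_tgid(const struct model_event *older,
			  const struct model_event *newer)
{
	assert(older != NULL && newer != NULL);
	return older->to_tell == newer->to_tell &&
	       older->data_type == newer->data_type &&
	       older->tgid == newer->tgid;
}

/* Check with the patch: the thread id has to match as well. */
bool should_merge_by_thread(const struct model_event *older,
			    const struct model_event *newer)
{
	return should_merge_by_tgid(older, newer) &&
	       older->pid == newer->pid;
}
```

For two events that share the object, data type and tgid but carry different thread ids, the first check reports a merge and the second does not, which is the behavior change the patch is after.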

-------------------------------

diff -r -u linux-3.1-rc4_orig/fs/notify/fanotify/fanotify.c
linux-3.1-rc4/fs/notify/fanotify/fanotify.c
--- linux-3.1-rc4_orig/fs/notify/fanotify/fanotify.c 2011-08-29
12:16:01.000000000 +0800
+++ linux-3.1-rc4/fs/notify/fanotify/fanotify.c 2011-10-10
12:28:23.276847000 +0800
@@ -15,7 +15,8 @@

if (old->to_tell == new->to_tell &&
old->data_type == new->data_type &&
- old->tgid == new->tgid) {
+ old->tgid == new->tgid &&
+ old->pid == new->pid) {
switch (old->data_type) {
case (FSNOTIFY_EVENT_PATH):
if ((old->path.mnt == new->path.mnt) &&
@@ -144,11 +145,19 @@
return PTR_ERR(notify_event);

#ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
- if (event->mask & FAN_ALL_PERM_EVENTS) {
- /* if we merged we need to wait on the new event */
- if (notify_event)
- event = notify_event;
- ret = fanotify_get_response_from_access(group, event);
+ //if overflow, do not wait for response
+ if(fsnotify_isoverflow(event))
+ {
+ pr_debug("fanotify overflow!\n");
+ }
+ else
+ {
+ if (event->mask & FAN_ALL_PERM_EVENTS) {
+ /* if we merged we need to wait on the new event */
+ if (notify_event)
+ event = notify_event;
+ ret = fanotify_get_response_from_access(group, event);
+ }
}
#endif

diff -r -u linux-3.1-rc4_orig/fs/notify/notification.c
linux-3.1-rc4/fs/notify/notification.c
--- linux-3.1-rc4_orig/fs/notify/notification.c 2011-08-29
12:16:01.000000000 +0800
+++ linux-3.1-rc4/fs/notify/notification.c 2011-10-10 12:27:09.331787000 +0800
@@ -95,6 +95,7 @@
BUG_ON(!list_empty(&event->private_data_list));

kfree(event->file_name);
+ put_pid(event->pid);
put_pid(event->tgid);
kmem_cache_free(fsnotify_event_cachep, event);
}
@@ -132,6 +133,14 @@
return priv;
}

+bool fsnotify_isoverflow(struct fsnotify_event *event)
+{
+ if(event==q_overflow_event)
+ {
+ return true;
+ }
+ return false;
+}
/*
* Add an event to the group notification queue. The group can later pull this
* event off the queue to deal with. If the event is successfully added to the
@@ -374,6 +383,7 @@
return NULL;
}
}
+ event->pid = get_pid(old_event->pid);
event->tgid = get_pid(old_event->tgid);
if (event->data_type == FSNOTIFY_EVENT_PATH)
path_get(&event->path);
@@ -417,6 +427,7 @@
event->name_len = strlen(event->file_name);
}

+ event->pid = get_pid(task_pid(current));
event->tgid = get_pid(task_tgid(current));
event->sync_cookie = cookie;
event->to_tell = to_tell;
diff -r -u linux-3.1-rc4_orig/include/linux/fsnotify_backend.h
linux-3.1-rc4/include/linux/fsnotify_backend.h
--- linux-3.1-rc4_orig/include/linux/fsnotify_backend.h 2011-08-29
12:16:01.000000000 +0800
+++ linux-3.1-rc4/include/linux/fsnotify_backend.h 2011-10-10
12:27:48.587369000 +0800
@@ -238,6 +238,7 @@
u32 sync_cookie; /* used to corrolate events, namely inotify mv events */
const unsigned char *file_name;
size_t name_len;
+ struct pid *pid;
struct pid *tgid;

#ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
@@ -378,6 +379,8 @@
struct fsnotify_event_private_data *priv,
struct fsnotify_event *(*merge)(struct list_head *,
struct fsnotify_event *));
+/*true if the event is an overflow event*/
+extern bool fsnotify_isoverflow(struct fsnotify_event *event);
/* true if the group notification queue is empty */
extern bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group);
/* return, but do not dequeue the first event on the notification queue */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

boyd yang

Oct 13, 2011, 9:30:02 AM

Could anybody review my patch and integrate it into the kernel?

On Thu, Oct 13, 2011 at 4:56 PM, boyd yang <boyd...@gmail.com> wrote:
> This patch fixes a hang in Eric Paris's fs notification/fanotify code.
>
> Fanotify provides a way to intercept file access events.
> When multiple threads iterate over the same directory, some of the threads hang.
> This patch makes fanotify distinguish access events from different
> threads, which prevents it from merging access events from different
> threads.
>

> =============================================================

Américo Wang

Oct 13, 2011, 10:40:01 AM

On Thu, Oct 13, 2011 at 9:27 PM, boyd yang <boyd...@gmail.com> wrote:
> Could anybody review my patch and integrate it into the kernel?
>
>

You need to adjust your coding style to the Linux kernel's;
see Documentation/CodingStyle.

boyd yang

Oct 14, 2011, 3:00:02 AM

Thanks, I have updated the patch.
It now passes checkpatch.pl.

I used the flag you mentioned.

fanotify: to differ file access event from different threads

When fanotify is monitoring the whole mount point "/" and multiple
threads iterate over the same directory, some of the threads hang.

This patch makes fanotify distinguish access events from different
threads, which prevents it from merging access events from different
threads.

It also prevents overflow events from reaching user space.

Signed-off-by: Boyd Yang <boyd...@gmail.com>

diff -r -u linux-3.1-rc4_orig/fs/notify/fanotify/fanotify.c
linux-3.1-rc4/fs/notify/fanotify/fanotify.c
--- linux-3.1-rc4_orig/fs/notify/fanotify/fanotify.c 2011-08-29
12:16:01.000000000 +0800

+++ linux-3.1-rc4/fs/notify/fanotify/fanotify.c 2011-10-14
14:17:53.055958000 +0800
@@ -15,7 +15,8 @@

if (old->to_tell == new->to_tell &&
old->data_type == new->data_type &&
- old->tgid == new->tgid) {
+ old->tgid == new->tgid &&
+ old->pid == new->pid) {
switch (old->data_type) {
case (FSNOTIFY_EVENT_PATH):
if ((old->path.mnt == new->path.mnt) &&

@@ -144,11 +145,16 @@
return PTR_ERR(notify_event);

#ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
- if (event->mask & FAN_ALL_PERM_EVENTS) {
- /* if we merged we need to wait on the new event */
- if (notify_event)
- event = notify_event;
- ret = fanotify_get_response_from_access(group, event);

+ /*if overflow, do not wait for response*/
+ if (event->mask&FS_Q_OVERFLOW) {
+ pr_debug("fanotify overflow!\n");
+ } else {
+ if (event->mask & FAN_ALL_PERM_EVENTS) {
+ /* if we merged we need to wait on the new event */
+ if (notify_event)
+ event = notify_event;
+ ret = fanotify_get_response_from_access(group, event);
+ }
}
#endif

diff -r -u linux-3.1-rc4_orig/fs/notify/notification.c
linux-3.1-rc4/fs/notify/notification.c
--- linux-3.1-rc4_orig/fs/notify/notification.c 2011-08-29
12:16:01.000000000 +0800

+++ linux-3.1-rc4/fs/notify/notification.c 2011-10-14 13:52:36.946608000 +0800

@@ -95,6 +95,7 @@
BUG_ON(!list_empty(&event->private_data_list));

kfree(event->file_name);
+ put_pid(event->pid);
put_pid(event->tgid);
kmem_cache_free(fsnotify_event_cachep, event);
}

@@ -374,6 +375,7 @@

return NULL;
}
}
+ event->pid = get_pid(old_event->pid);
event->tgid = get_pid(old_event->tgid);
if (event->data_type == FSNOTIFY_EVENT_PATH)
path_get(&event->path);

@@ -417,6 +419,7 @@

event->name_len = strlen(event->file_name);
}

+ event->pid = get_pid(task_pid(current));
event->tgid = get_pid(task_tgid(current));
event->sync_cookie = cookie;
event->to_tell = to_tell;
diff -r -u linux-3.1-rc4_orig/include/linux/fsnotify_backend.h
linux-3.1-rc4/include/linux/fsnotify_backend.h
--- linux-3.1-rc4_orig/include/linux/fsnotify_backend.h 2011-08-29
12:16:01.000000000 +0800

+++ linux-3.1-rc4/include/linux/fsnotify_backend.h 2011-10-14
13:51:50.380168000 +0800

@@ -238,6 +238,7 @@
u32 sync_cookie; /* used to corrolate events, namely inotify mv events */
const unsigned char *file_name;
size_t name_len;
+ struct pid *pid;
struct pid *tgid;

#ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS

On Thu, Oct 13, 2011 at 9:38 PM, Josef Bacik <jo...@redhat.com> wrote:

> On Thu, Oct 13, 2011 at 04:56:43PM +0800, boyd yang wrote:
>> This patch fixes a hang in Eric Paris's fs notification/fanotify code.
>>
>> Fanotify provides a way to intercept file access events.
>> When multiple threads iterate over the same directory, some of the threads hang.
>> This patch makes fanotify distinguish access events from different
>> threads, which prevents it from merging access events from different
>> threads.
>>
>

> You need to run this through checkpatch.pl, you have a ton of formatting
> problems. Also your email client seems to have word-wrapped parts of this, so
> use an email client that doesn't word wrap.

> The overflow event should only have FS_Q_OVERFLOW set in its mask, right? So
> why is this test needed at all? Thanks,
>
> Josef

boyd yang

Nov 17, 2011, 5:30:02 AM

How can this patch get merged?
Who is responsible for this?

We are developing real-time antivirus software, and fanotify lets it
work without drivers or kernel modules (like RedirFS).
But this bug causes a lot of trouble.
With the patch applied, our software works great!

I think other antivirus or file-monitoring software is affected by the bug
too.

Jan Kara

Nov 24, 2011, 10:00:01 AM

On Thu 17-11-11 18:21:26, boyd yang wrote:
> How can this patch get merged?
> Who is responsible for this?

Eric Paris is responsible for merging this patch. I see you have CCed him
so that should be OK. Can you maybe resend the patch in a separate email
again? I see your last submission was included in a reply to email together
with other text etc. which makes merging it using our standard tools
harder...

Honza

--
Jan Kara <ja...@suse.cz>
SUSE Labs, CR

boyd yang

Dec 5, 2011, 8:30:01 PM

Thanks, Jan Kara!
I once thought it was buried.
Now I have resent the patch.

boyd yang

Dec 5, 2011, 8:30:02 PM

Lino Sanfilippo

Dec 10, 2011, 11:30:02 AM

On Tue, Dec 06, 2011 at 09:23:25AM +0800, boyd yang wrote:
> fanotify: to differ file access event from different threads
> When fanotify is monitoring the whole mount point "/", and multiple
> threads iterate over the same directory, some of the threads hang.
> This patch makes fanotify distinguish access events from different
> threads, which prevents it from merging access events from different
> threads.

I could reproduce the problem you described. Actually, it may occur whenever
more than one thread of a process is waiting in fanotify_get_response()
at the same time. This is because we are not prepared to wake up more than one
waiter. We do

wait_event(group->fanotify_data.access_waitq, event->response ||
atomic_read(&group->fanotify_data.bypass_perm));

and after that
event->response = 0;

which is the reason that even if we woke up other waiters on the same waitqueue
they may see event->response being already 0, go back to sleep and then possibly
hang forever.
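
The sequence described above can be replayed in a small single-threaded model. It is an illustration rather than kernel code: `struct model_perm_event` and the three functions are hypothetical names, the waiters run one after another instead of concurrently, and `MODEL_FAN_ALLOW` merely mirrors the value of `FAN_ALLOW`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FAN_ALLOW 0x01	/* mirrors FAN_ALLOW; any non-zero value works */

/* Hypothetical stand-in for a permission event shared by several waiters. */
struct model_perm_event {
	unsigned int response;	/* 0 means "no answer yet" */
};

/* The wait_event() condition: proceed only if a response is present. */
bool waiter_may_proceed(const struct model_perm_event *ev)
{
	return ev->response != 0;
}

/* What a waiter does after waking: read the response, then clear it. */
unsigned int waiter_consume(struct model_perm_event *ev)
{
	unsigned int r = ev->response;

	ev->response = 0;
	return r;
}

/* Let n_waiters look at one answered event in turn and count how many
 * get past the condition. Waiters that fail it would sleep again. */
int count_released_waiters(struct model_perm_event *ev, int n_waiters)
{
	int released = 0;

	assert(ev != NULL && n_waiters >= 0);
	for (int i = 0; i < n_waiters; i++) {
		if (!waiter_may_proceed(ev))
			continue;	/* sees response == 0, goes back to sleep */
		waiter_consume(ev);
		released++;
	}
	return released;
}
```

In this model a single answer releases only the first waiter, however many waiters share the event. The rest find the response already cleared, which corresponds to the threads that go back to sleep in the description above.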

> It also hide overflow events to reach user space.

This is not part of this patch, and it should not be, since overflow
events are supposed to reach user space.

> - if (event->mask & FAN_ALL_PERM_EVENTS) {
> - /* if we merged we need to wait on the new event */
> - if (notify_event)
> - event = notify_event;
> - ret = fanotify_get_response_from_access(group, event);
> + /*if overflow, do not wait for response*/
> + if (event->mask&FS_Q_OVERFLOW) {
> + pr_debug("fanotify overflow!\n");
> + } else {
> + if (event->mask & FAN_ALL_PERM_EVENTS) {
> + /* if we merged we need to wait on the new event */
> + if (notify_event)
> + event = notify_event;
> + ret = fanotify_get_response_from_access(group, event);
> + }
> }

What is this for? All you do is introduce a debug message for no real reason.
However, your fix (avoid merging events from different threads of the same
process) seems to work.
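
The point about the extra test can be checked with plain bit arithmetic. The sketch below assumes the mask values of the 3.1-era headers (`FS_Q_OVERFLOW` as 0x00004000, `FAN_OPEN_PERM` as 0x00010000, `FAN_ACCESS_PERM` as 0x00020000) and assumes, as Josef suggested earlier in the thread, that the overflow event carries only `FS_Q_OVERFLOW` in its mask. The `MODEL_` names and both functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed copies of the kernel mask bits; check them against your tree. */
#define MODEL_FS_Q_OVERFLOW		0x00004000u
#define MODEL_FAN_OPEN_PERM		0x00010000u
#define MODEL_FAN_ACCESS_PERM		0x00020000u
#define MODEL_FAN_ALL_PERM_EVENTS \
	(MODEL_FAN_OPEN_PERM | MODEL_FAN_ACCESS_PERM)

/* Original check: wait for a response only for permission events. */
bool waits_for_response(uint32_t mask)
{
	return (mask & MODEL_FAN_ALL_PERM_EVENTS) != 0;
}

/* Check as written in the patch: test for overflow first. */
bool patched_waits_for_response(uint32_t mask)
{
	if (mask & MODEL_FS_Q_OVERFLOW)
		return false;	/* the patch only prints a debug message here */
	return waits_for_response(mask);
}
```

Under these assumptions both functions return false for a mask of `MODEL_FS_Q_OVERFLOW` alone, so the added branch does not change whether the kernel waits for a response on an overflow event.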

Regards,
Lino

Mihai Donțu

Feb 11, 2013, 3:00:03 PM

This patch triggers the following on my 3.7.6 kernel:

INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0,
t=15002 jiffies) sending NMI to all CPUs:
NMI backtrace for cpu 0
CPU 0
Modules linked in: ext2 ppdev parport_pc mac_hid psmouse serio_raw
i2c_piix4 lp parport 8139too floppy 8139cp

Pid: 0, comm: swapper/0 Not tainted 3.2.35 #12 Bochs Bochs
RIP: 0010:[<ffffffff81037bdf>] [<ffffffff81037bdf>]
flat_send_IPI_all+0xaf/0xd0 RSP: 0018:ffff88003fc03d88 EFLAGS: 00010006
RAX: 0000000000000000 RBX: 0000000000000046 RCX: 0000000000000050
RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000300
RBP: ffff88003fc03da8 R08: 000000000000000a R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000c00
R13: 0000000003000000 R14: 0000000000000001 R15: ffffffff81c32d00
FS: 0000000000000000(0000) GS:ffff88003fc00000(0000)
knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fb5d3100000 CR3: 000000001ca06000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 0, threadinfo ffffffff81c00000, task
ffffffff81c0d020) Stack:
0000000000000000 0000000000002710 ffffffff81c31c00 ffffffff81c31d00
ffff88003fc03dc8 ffffffff81033231 000000000000000a ffff88003fc0ec40
ffff88003fc03e18 ffffffff810defbe ffff880000000001 ffffffff81c32d00
Call Trace:
<IRQ>
[<ffffffff81033231>] arch_trigger_all_cpu_backtrace+0x61/0xa0
[<ffffffff810defbe>] __rcu_pending+0x3ae/0x420
[<ffffffff810df329>] rcu_check_callbacks+0x79/0x1e0
[<ffffffff81078068>] update_process_times+0x48/0x90
[<ffffffff8109b4f4>] tick_sched_timer+0x64/0xc0
[<ffffffff8108df56>] __run_hrtimer+0x76/0x1f0
[<ffffffff8109b490>] ? tick_nohz_handler+0x100/0x100
[<ffffffff8108e907>] hrtimer_interrupt+0xf7/0x230
[<ffffffff81650409>] smp_apic_timer_interrupt+0x69/0x99
[<ffffffff8164e2de>] apic_timer_interrupt+0x6e/0x80
<EOI>
[<ffffffff810904a5>] ? sched_clock_local+0x25/0x90
[<ffffffff8103cedb>] ? native_safe_halt+0xb/0x10
[<ffffffff8101c6b3>] default_idle+0x53/0x1d0
[<ffffffff81013236>] cpu_idle+0xd6/0x120
[<ffffffff816172ce>] rest_init+0x72/0x74
[<ffffffff81cfcba5>] start_kernel+0x3b0/0x3bd
[<ffffffff81cfc347>] x86_64_start_reservations+0x132/0x136
[<ffffffff81cfc140>] ? early_idt_handlers+0x140/0x140
[<ffffffff81cfc44d>] x86_64_start_kernel+0x102/0x111
[...]

It happens after my application runs for half an hour or so. However, I
don't see how this could possibly solve the problem I've observed: due
to a race, a kernel thread ends up doing wait_event() on an event which
soon after is merged by a different thread into a new one which becomes
the actual event to be "received" by the content introspection
application. It's easily reproducible with a simple script:

$ while true; do cp -f /root/eicar.com /root/watched-dir; done

all the while the fanotify application does a re-open (RD -> RDWR) and
truncate(0), on multiple threads.

(I do a fanotify_init(O_RDONLY) because of surprise ETXTBSY)

Anyway, regardless of how I use the API the race needs to be
eliminated somehow. So my problem now is: how do I switch all
wait_event()-users to the new event created by fanotify_merge()?

--
Mihai Donțu