Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[Differential] [Changed Subscribers] D1986: Teach lagg(4) to change MTU

2 views
Skip to first unread message

rstone (Ryan Stone)

unread,
Feb 28, 2015, 5:20:19 PM2/28/15
to
rstone added a subscriber: freebsd-net.

REVISION DETAIL
https://reviews.freebsd.org/D1986

To: rpokala-panasas.com, rstone
Cc: freebsd-net
_______________________________________________
freeb...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net...@freebsd.org"

ae (Andrey V. Elsukov)

unread,
Mar 1, 2015, 7:50:26 AM3/1/15
to
ae added a subscriber: ae.
ae added a comment.

Just a thought. Imagine two interfaces, one has maximum MTU 2200, another 1500. lagg0 has MTU 1400.
Two threads invokes changing MTU in the same time. One wants to change it to 2000, another - to 1500.
It is possible, that when both threads will finish its job, one interface will have MTU 1400, but another - 1500.
I mean, that such changes should be done exclusively without possibility of races in the ioctl code.
Cc: ae, freebsd-net

rpokala-panasas.com (Ravi Pokala)

unread,
Mar 1, 2015, 4:50:16 PM3/1/15
to
rpokala-panasas.com added a comment.

>>! In D1986#4, @ae wrote:
> Just a thought. Imagine two interfaces, one has maximum MTU 2200, another 1500. lagg0 has MTU 1400.
> Two threads invokes changing MTU in the same time. One wants to change it to 2000, another - to 1500.
> It is possible, that when both threads will finish its job, one interface will have MTU 1400, but another - 1500.
> I mean, that such changes should be done exclusively without possibility of races in the ioctl code.

Don't the calls to LAGG_RLOCK() / LAGG_RUNLOCK() prevent that by enforcing serialization?

rstone (Ryan Stone)

unread,
Mar 1, 2015, 6:57:31 PM3/1/15
to
rstone added a comment.

RLOCK only gets a read lock. You want WLOCK to get a write lock to ensure serialization.

rstone (Ryan Stone)

unread,
Mar 1, 2015, 7:02:17 PM3/1/15
to
rstone added inline comments.

INLINE COMMENTS
sys/net/if_lagg.c:1772 style(9) says to not include unnecessary braces (which I personally disagree with, but what can you do?)
sys/net/if_lagg.c:1773 style(9): put brackets around the return value:

return (0);
sys/net/if_lagg.c:1811 I find the flow control here a bit confusing (my first read through, I thought that err2 could be used unitinialized). Given that you have a continue in the if block, I would find it clearer to not have an else here

emaste (Ed Maste)

unread,
Mar 1, 2015, 11:38:36 PM3/1/15
to
emaste added a subscriber: emaste.
Cc: emaste, ae, freebsd-net

ae (Andrey V. Elsukov)

unread,
Mar 2, 2015, 2:38:44 AM3/2/15
to
ae added a comment.

>>! In D1986#7, @rstone wrote:
> RLOCK only gets a read lock. You want WLOCK to get a write lock to ensure serialization.

Also we can use another lock in the lagg_ioctl, that will prevent simultaneous MTU changing.

rpokala-panasas.com (Ravi Pokala)

unread,
Mar 2, 2015, 8:29:06 PM3/2/15
to
rpokala-panasas.com updated this revision to Diff 4079.
rpokala-panasas.com added a comment.

# Updating D1986: Teach lagg(4) to change MTU
Addressing review comments from rstone:

Fix style(9) violations
Replace LAGG_RLOCK() w/ LAGG_WLOCK()
Clarify logic for roll-back
Better comments throughout.

CHANGES SINCE LAST UPDATE
https://reviews.freebsd.org/D1986?vs=4026&id=4079

BRANCH
/head

REVISION DETAIL
https://reviews.freebsd.org/D1986

AFFECTED FILES
sys/net/if_lagg.c

rpokala-panasas.com (Ravi Pokala)

unread,
Mar 2, 2015, 8:40:29 PM3/2/15
to
rpokala-panasas.com added a comment.

4079 addresses @rstone's comments. However, the LOR remains:

```
Mar 2 17:00:43 fbsd-X root: ifconfig lagg0 mtu 9000
Mar 2 17:00:43 fbsd-X kernel: lock order reversal:
Mar 2 17:00:43 fbsd-X kernel: 1st 0xfffff80004960c08 if_lagg rmlock (if_lagg rmlock) @ /usr/src/sys/modules/if_lagg/../../net/if_lagg.c:1779
Mar 2 17:00:43 fbsd-X kernel: 2nd 0xfffffe0000d50748 em1 (EM Core Lock) @ /usr/src/sys/dev/e1000/if_lem.c:1035
Mar 2 17:00:43 fbsd-X kernel: KDB: stack backtrace:
Mar 2 17:00:43 fbsd-X kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00003f86c0
Mar 2 17:00:43 fbsd-X kernel: witness_checkorder() at witness_checkorder+0xe45/frame 0xfffffe00003f8750
Mar 2 17:00:43 fbsd-X kernel: __mtx_lock_flags() at __mtx_lock_flags+0xa8/frame 0xfffffe00003f87a0
Mar 2 17:00:43 fbsd-X kernel: lem_ioctl() at lem_ioctl+0x3e8/frame 0xfffffe00003f87e0
Mar 2 17:00:43 fbsd-X kernel: lagg_ioctl() at lagg_ioctl+0xafd/frame 0xfffffe00003f88e0
Mar 2 17:00:43 fbsd-X kernel: ifioctl() at ifioctl+0x102e/frame 0xfffffe00003f89a0
Mar 2 17:00:43 fbsd-X kernel: kern_ioctl() at kern_ioctl+0x2c0/frame 0xfffffe00003f8a00
Mar 2 17:00:43 fbsd-X kernel: sys_ioctl() at sys_ioctl+0x153/frame 0xfffffe00003f8ae0
Mar 2 17:00:43 fbsd-X kernel: amd64_syscall() at amd64_syscall+0x27f/frame 0xfffffe00003f8bf0
Mar 2 17:00:43 fbsd-X kernel: Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe00003f8bf0
Mar 2 17:00:43 fbsd-X kernel: --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8011e1eea, rsp = 0x7fffffffe288, rbp = 0x7fffffffe2a0 ---
```

REVISION DETAIL
https://reviews.freebsd.org/D1986

rpokala-panasas.com (Ravi Pokala)

unread,
Mar 5, 2015, 5:19:53 PM3/5/15
to
rpokala-panasas.com added a comment.

One of my colleagues dug into the first LOR - the one that happens when adding em1 to lagg0, which happens both with and without the patch. It looks like this is a known issue:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194109

lakshmi.n_msystechnologies.com (LN)

unread,
Oct 1, 2015, 3:23:43 AM10/1/15
to
lakshmi.n_msystechnologies.com added a subscriber: lakshmi.n_msystechnologies.com.
lakshmi.n_msystechnologies.com added a comment.

We (Panasas) tried reproducing the problem by applying the patch and enabled WITNESS in freebsd stable (10.1) kernel.
While testing with this patched kernel, LOR not observed on ixl and igb ports.

Based on the LOR logs mentioned earlier, the issue seems to be observed on device specific driver like em (e1000).
We tried doing code analysis on em driver and could not find anything obvious. Our platform do not use em driver anymore.

The issue seems to be on device specific driver and not the lagg driver changes that we introduced.
So, we kindly request you to accept these changes to mainline.


REPOSITORY
rS FreeBSD src repository

REVISION DETAIL
https://reviews.freebsd.org/D1986

EMAIL PREFERENCES
https://reviews.freebsd.org/settings/panel/emailpreferences/

To: rpokala-panasas.com, rstone
Cc: lakshmi.n_msystechnologies.com, emaste, ae, freebsd-net-list
_______________________________________________
freeb...@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net

sbruno (Sean Bruno)

unread,
Oct 1, 2015, 11:44:45 AM10/1/15
to
sbruno added a subscriber: sbruno.
sbruno added a comment.

In https://reviews.freebsd.org/D1986#77466, @lakshmi.n_msystechnologies.com wrote:

> We (Panasas) tried reproducing the problem by applying the patch and enabled WITNESS in freebsd stable (10.1) kernel.
> While testing with this patched kernel, LOR not observed on ixl and igb ports.
>
> Based on the LOR logs mentioned earlier, the issue seems to be observed on device specific driver like em (e1000).
> We tried doing code analysis on em driver and could not find anything obvious. Our platform do not use em driver anymore.
>
> The issue seems to be on device specific driver and not the lagg driver changes that we introduced.
> So, we kindly request you to accept these changes to mainline.


lem_ioctl() at lem_ioctl+0x3e8/frame 0xfffffe00003f87e0

This actually means sys/dev/e1000/if_lem.c ... I'll take a look at the LOR and see if its fixable.


REPOSITORY
rS FreeBSD src repository

REVISION DETAIL
https://reviews.freebsd.org/D1986

EMAIL PREFERENCES
https://reviews.freebsd.org/settings/panel/emailpreferences/

To: rpokala-panasas.com, rstone
Cc: sbruno, lakshmi.n_msystechnologies.com, emaste, ae, freebsd-net-list

lakshmi.n_msystechnologies.com (LN)

unread,
Oct 19, 2015, 1:59:50 AM10/19/15
to
lakshmi.n_msystechnologies.com added a comment.

@sbruno, A gentle remainder on em driver's LOR issue. Can you please share your findings.

hrs (Hiroki Sato)

unread,
Oct 19, 2015, 2:21:31 AM10/19/15
to
hrs added a subscriber: hrs.
hrs added a comment.

It is true that this LOR is driver-specific but calling SIOCSIFMTU after acquiring a lock in lagg ioctl is not always safe. Change of lladdr suffers from the same situation and it was solved by using an asynchronous task queue to update addresses on each port. What do you think about piggybacking an MTU change to the queue by extending struct lagg_llq to a more generic one which makes it possible to handle per-port properties?


REPOSITORY
rS FreeBSD src repository

REVISION DETAIL
https://reviews.freebsd.org/D1986

EMAIL PREFERENCES
https://reviews.freebsd.org/settings/panel/emailpreferences/

To: rpokala-panasas.com, rstone
Cc: hrs, sbruno, lakshmi.n_msystechnologies.com, emaste, ae, freebsd-net-list

melifaro (Alexander V. Chernikov)

unread,
Oct 27, 2015, 6:38:36 AM10/27/15
to
melifaro added a subscriber: melifaro.
melifaro added a comment.

After digging into lagg internals on updating lladdrs on lagg ports, I'd also vote for extenging llq to deal with MTU changes for underlying interfaces


REPOSITORY
rS FreeBSD src repository

REVISION DETAIL
https://reviews.freebsd.org/D1986

EMAIL PREFERENCES
https://reviews.freebsd.org/settings/panel/emailpreferences/

To: rpokala-panasas.com, rstone
Cc: melifaro, hrs, sbruno, lakshmi.n_msystechnologies.com, emaste, ae, freebsd-net-list

lakshmi.n_msystechnologies.com (LN)

unread,
Oct 28, 2015, 3:25:23 AM10/28/15
to
lakshmi.n_msystechnologies.com added a comment.

Thanks @hrs and @melifaro for the suggestions.

I am working on the code changes to handle the MTU asynchronously. I will update the tested patch for review by this week.

rpokala (Ravi Pokala)

unread,
Dec 14, 2015, 2:48:30 PM12/14/15
to
rpokala commandeered this revision.
rpokala added a reviewer: rpokala-panasas.com.
Herald added a subscriber: imp.

REPOSITORY
rS FreeBSD src repository

REVISION DETAIL
https://reviews.freebsd.org/D1986

EMAIL PREFERENCES
https://reviews.freebsd.org/settings/panel/emailpreferences/

To: rpokala, rstone, rpokala-panasas.com
Cc: imp, melifaro, hrs, sbruno, lakshmi.n_msystechnologies.com, emaste, ae, freebsd-net-list

rpokala (Ravi Pokala)

unread,
Dec 15, 2015, 10:37:43 AM12/15/15
to
rpokala updated this revision to Diff 11321.
rpokala added a comment.


Updated patch from LN, addressing comments from hrs@, melifaro@, and internal Panasas review.

REPOSITORY
rS FreeBSD src repository

CHANGES SINCE LAST UPDATE
https://reviews.freebsd.org/D1986?vs=4079&id=11321

BRANCH
/head

REVISION DETAIL
https://reviews.freebsd.org/D1986

AFFECTED FILES
sys/net/if_lagg.c
sys/net/if_lagg.h
D1986.11321.patch

rpokala (Ravi Pokala)

unread,
Dec 15, 2015, 10:42:01 AM12/15/15
to
rpokala marked 3 inline comments as done.

REPOSITORY
rS FreeBSD src repository

REVISION DETAIL
https://reviews.freebsd.org/D1986

EMAIL PREFERENCES
https://reviews.freebsd.org/settings/panel/emailpreferences/

To: rpokala, rstone, rpokala-panasas.com
Cc: imp, melifaro, hrs, sbruno, lakshmi.n_msystechnologies.com, emaste, ae, freebsd-net-list

smh (Steven Hartland)

unread,
Dec 17, 2015, 4:53:23 PM12/17/15
to
smh added a subscriber: smh.
smh added a comment.


Some general style nits and a question re-loss of the mtu set error.

INLINE COMMENTS
sys/net/if_lagg.c:706 style(9) bool use of pointer type.
sys/net/if_lagg.c:731 style(9) four space additional indent only should be used, more below.
sys/net/if_lagg.c:769 Looks like this attempts to change more that would have been done above before the error, is that intended?
sys/net/if_lagg.c:783 Looks like the error gets lost, although printed, is there no way we can avoid this?
sys/net/if_lagg.c:812 style(9) bracing around return.
sys/net/if_lagg.c:818 style(9) init of vars in declaration should be avoided.

Moving to down to where its first needed can avoid setting it at all.
sys/net/if_lagg.c:837 style(9) bool use of pointer type.
sys/net/if_lagg.c:843 style(9) bool use of pointer type.
sys/net/if_lagg.c:852 consider: "it might have been updated."
sys/net/if_lagg.c:861 style(9) bool use of pointer type.

REPOSITORY
rS FreeBSD src repository

REVISION DETAIL
https://reviews.freebsd.org/D1986

EMAIL PREFERENCES
https://reviews.freebsd.org/settings/panel/emailpreferences/

To: rpokala, rstone, rpokala-panasas.com
Cc: smh, imp, melifaro, hrs, sbruno, lakshmi.n_msystechnologies.com, emaste, ae, freebsd-net-list

melifaro (Alexander V. Chernikov)

unread,
Dec 17, 2015, 5:16:25 PM12/17/15
to
melifaro added inline comments.

INLINE COMMENTS
sys/net/if_lagg.c:728 What if we have multiple events queued on tasq? e.g mtu AND mac change
sys/net/if_lagg.c:763 Not that easy, unfortunately.
At this moment original ioctl returned 0, so other things/events were fired:
rtsock notification, IPv6 nd mtg update, route table mtg update.. see net/if.c for more details.
Trying to silently revert part of it without dealing with other things is not a good idea.

hrs (Hiroki Sato)

unread,
Dec 18, 2015, 5:11:40 PM12/18/15
to
hrs added inline comments.

INLINE COMMENTS
sys/net/if_lagg.c:753 Please separate a llq loop from a handler for per-port configuration. A llq traversal should be required only once in lagg_port_ops() if the handlers process a single lagg_llq entry.
sys/net/if_lagg.c:837 Is this (llq == NULL), not (llq != NULL)?
sys/net/if_lagg.c:840 Why is cleanup required here? This removes all of tasks not limited to MTU change.
sys/net/if_lagg.c:861 free(NULL) does nothing. Checking if NULL or not is useless.
sys/net/if_lagg.c:872 This traversal and freeing an entry after processing it should be done in lagg_port_ops().
sys/net/if_lagg.h:220 Please add "llq_" prefix to the members.
sys/net/if_lagg.h:221 Is there any reason to have ifr as a pointer? malloc is generally expensive in kernel, and overhead of struct ifreq is acceptable for me even if every llq has one. I feel this complicates the error handling at least.

lakshmi.n_msystechnologies.com (LN)

unread,
Jan 6, 2016, 10:12:51 AM1/6/16
to
lakshmi.n_msystechnologies.com added a comment.


Thanks @smh, @melifaro and @hrs for the review comments.

rpokala (Ravi Pokala)

unread,
Jan 6, 2016, 10:10:19 PM1/6/16
to
rpokala updated this revision to Diff 11984.
rpokala marked 8 inline comments as done.
rpokala added a comment.


Address review comments from smh@, melifaro@, and hrs@.

REPOSITORY
rS FreeBSD src repository

CHANGES SINCE LAST UPDATE
https://reviews.freebsd.org/D1986?vs=11321&id=11984

BRANCH
/head

REVISION DETAIL
https://reviews.freebsd.org/D1986

AFFECTED FILES
sys/kern/kern_condvar.c
sys/net/if_lagg.c
sys/net/if_lagg.h
sys/sys/condvar.h
D1986.11984.patch

rpokala (Ravi Pokala)

unread,
Aug 1, 2016, 4:17:50 PM8/1/16
to
rpokala abandoned this revision.
rpokala added a comment.


This turned out to be way more complicated than anyone expected, the people who were working on it for Panasas are no longer with the company, the changes have bitrotted due to six months of neglect from me, and we came up with a reasonable workaround in the scripting layer. For all those reasons, I'm abandoning this diff.

REPOSITORY
rS FreeBSD src repository

REVISION DETAIL
https://reviews.freebsd.org/D1986

EMAIL PREFERENCES
https://reviews.freebsd.org/settings/panel/emailpreferences/

To: rpokala, rstone, rpokala-panasas.com, smh, melifaro
Cc: rpokala-panasas.com, smh, imp, melifaro, hrs, sbruno, lakshmi.n_msystechnologies.com, emaste, ae, freebsd-net-list
0 new messages