On Sun, Nov 30, 2025 at 08:09:48PM +0100, Oliver Hartkopp wrote:
Hello Oliver,
I tried to investigate further why the XDP path was chosen in spite of using
vxcan. I looked for dummy_can.c in the upstream tree but could not find
it; I might be missing something here. Could you please tell me where I can
find it? Meanwhile, I used GDB for the analysis.
I observed the following in the bug's strace log:
[pid 5804] bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, insn_cnt=3, insns=0x200000c0, license="syzkaller", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_XDP, prog_btf_fd=-1, func_info_rec_size=8, func_info=NULL, func_info_cnt=0, line_info_rec_size=16, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL, ...}, 144) = 3
[pid 5804] socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE) = 4
[pid 5804] sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\x34\x00\x00\x00\x10\x00\x01\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x40\x01\x00\x00\x00\x01\x00\x0c\x00\x2b\x80\x08\x00\x01\x00\x03\x00\x00\x00\x08\x00\x1b\x00\x00\x00\x00\x00", iov_len=52}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_DONTWAIT|MSG_FASTOPEN}, 0) = 52
[pid 5804] socket(AF_CAN, SOCK_DGRAM, CAN_ISOTP) = 5
[pid 5804] ioctl(5, SIOCGIFINDEX, {ifr_name="vxcan0", ifr_ifindex=20}) = 0
Notably, before binding vxcan0 to the CAN socket, a BPF program is loaded.
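To double-check what the sendmsg() call above asks for, the netlink payload can be decoded by hand. The following is only a minimal userspace sketch: the byte array is copied from the strace line, the numeric constants are the uapi values as I understand them (RTM_NEWLINK = 16, IFLA_GROUP = 27, IFLA_XDP = 43, IFLA_XDP_FD = 1), all demo_* names are made up for illustration, and a little-endian layout is assumed as on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values believed to match the Linux uapi headers; the DEMO_ names are local. */
#define DEMO_RTM_NEWLINK   16
#define DEMO_IFLA_GROUP    27
#define DEMO_IFLA_XDP      43
#define DEMO_IFLA_XDP_FD   1
#define DEMO_NLA_TYPE_MASK 0x3fff /* strips the nested/byte-order flag bits */
#define DEMO_HDR_LEN       32     /* 16-byte nlmsghdr + 16-byte ifinfomsg */

/* The 52 bytes from the sendmsg() line in the strace log. */
static const uint8_t demo_msg[52] = {
	0x34, 0x00, 0x00, 0x00, 0x10, 0x00, 0x01, 0x08, /* len, type, flags */
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* seq, pid */
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* family/type, ifi_index */
	0x80, 0x40, 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, /* ifi_flags, ifi_change */
	0x0c, 0x00, 0x2b, 0x80, 0x08, 0x00, 0x01, 0x00, /* nested attr 43, attr 1 */
	0x03, 0x00, 0x00, 0x00,                         /* value 3 */
	0x08, 0x00, 0x1b, 0x00, 0x00, 0x00, 0x00, 0x00, /* attr 27, value 0 */
};

struct demo_result {
	uint16_t msg_type;
	int32_t ifindex;
	int has_xdp_fd;
	int32_t xdp_fd;
	int has_group;
	uint32_t group;
};

static uint16_t demo_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t demo_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walk the attributes in [p, p + len); recurse once into the XDP nest. */
static void demo_walk(const uint8_t *p, size_t len, int in_xdp,
		      struct demo_result *r)
{
	while (len >= 4) {
		uint16_t alen = demo_le16(p);
		uint16_t atype = demo_le16(p + 2) & DEMO_NLA_TYPE_MASK;
		size_t plen, step;

		if (alen < 4 || alen > len)
			break;
		plen = (size_t)alen - 4;

		if (in_xdp) {
			if (atype == DEMO_IFLA_XDP_FD && plen >= 4) {
				r->has_xdp_fd = 1;
				r->xdp_fd = (int32_t)demo_le32(p + 4);
			}
		} else if (atype == DEMO_IFLA_XDP) {
			demo_walk(p + 4, plen, 1, r);
		} else if (atype == DEMO_IFLA_GROUP && plen >= 4) {
			r->has_group = 1;
			r->group = demo_le32(p + 4);
		}

		step = ((size_t)alen + 3u) & ~(size_t)3u; /* 4-byte alignment */
		if (step > len)
			break;
		p += step;
		len -= step;
	}
}

struct demo_result demo_decode(const uint8_t *msg, size_t len)
{
	struct demo_result r = {0};

	if (len < DEMO_HDR_LEN)
		return r;
	r.msg_type = demo_le16(msg + 4);          /* nlmsg_type */
	r.ifindex = (int32_t)demo_le32(msg + 20); /* ifi_index */
	demo_walk(msg + DEMO_HDR_LEN, len - DEMO_HDR_LEN, 0, &r);
	return r;
}

/* Usage example: decode the traced message and check the fields of interest. */
int demo_check_msg(void)
{
	struct demo_result r = demo_decode(demo_msg, sizeof(demo_msg));

	assert(r.msg_type == DEMO_RTM_NEWLINK);
	assert(r.ifindex == 0);
	assert(r.has_xdp_fd && r.xdp_fd == 3);
	assert(r.has_group && r.group == 0);
	return 0;
}
```

If this decoding is right, the request carries ifi_index 0, an XDP attribute nest with fd 3 (the fd returned by BPF_PROG_LOAD above) and a group attribute with value 0. That would be consistent with the program being attached to several devices rather than to a single one.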
I then used GDB to check this and observed the following:
(gdb) b vxcan_xmit
Breakpoint 23 at 0xffffffff88ca899e: file drivers/net/can/vxcan.c, line 38.
(gdb) delete 23
(gdb) b __sys_bpf
Breakpoint 24 at 0xffffffff81d2653e: file kernel/bpf/syscall.c, line 5752.
(gdb) b bpf_prog_load
Breakpoint 25 at 0xffffffff81d2cd80: file kernel/bpf/syscall.c, line 2736.
(gdb) b vxcan_xmit if (oskb->dev->name[0]=='v' && ((oskb->dev->name[1]=='x' && oskb->dev->name[2]=='c' && oskb->dev->name[3]=='a' && oskb->dev->name[4]=='n') || (oskb->dev->name[1]=='c' && oskb->dev->name[2]=='a' && oskb->dev->name[3]=='n')))
Breakpoint 26 at 0xffffffff88ca899e: file drivers/net/can/vxcan.c, line 38.
(gdb) b __netif_receive_skb if (skb->dev->name[0]=='v' && ((skb->dev->name[1]=='x' && skb->dev->name[2]=='c' && skb->dev->name[3]=='a' && skb->dev->name[4]=='n') || (skb->dev->name[1]=='c' && skb->dev->name[2]=='a' && skb->dev->name[3]=='n')))
Breakpoint 27 at 0xffffffff8ce3c310: file net/core/dev.c, line 5798.
(gdb) b do_xdp_generic if (pskb->dev->name[0]=='v' && ((pskb->dev->name[1]=='x' && pskb->dev->name[2]=='c' && pskb->dev->name[3]=='a' && pskb->dev->name[4]=='n') || (pskb->dev->name[1]=='c' && pskb->dev->name[2]=='a' && pskb->dev->name[3]=='n')))
Breakpoint 28 at 0xffffffff8cdfccd7: file net/core/dev.c, line 5171.
(gdb) b dev_xdp_attach if (dev->name[0]=='v' && ((dev->name[1]=='x' && dev->name[2]=='c' && dev->name[3]=='a' && dev->name[4]=='n') || (dev->name[1]=='c' && dev->name[2]=='a' && dev->name[3]=='n')))
Breakpoint 29 at 0xffffffff8ce18b4e: file net/core/dev.c, line 9610.
Thread 2 hit Breakpoint 24, __sys_bpf (cmd=cmd@entry=BPF_PROG_LOAD, uattr=..., size=size@entry=144) at kernel/bpf/syscall.c:5752
5752 {
(gdb) c
Continuing.
Thread 2 hit Breakpoint 25, bpf_prog_load (attr=attr@entry=0xffff88811c987d60, uattr=..., uattr_size=144) at kernel/bpf/syscall.c:2736
2736 {
(gdb) c
Continuing.
[Switching to Thread 1.1]
Thread 1 hit Breakpoint 29, dev_xdp_attach (dev=dev@entry=0xffff888124e78000, extack=extack@entry=0xffff88811c987858, link=link@entry=0x0 <fixed_percpu_data>, new_prog=new_prog@entry=0xffffc9000a516000, old_prog=old_prog@entry=0x0 <fixed_percpu_data>, flags=flags@entry=0) at net/core/dev.c:9610
9610 {
(gdb) p dev->name
$104 = "vcan0\000\000\000\000\000\000\000\000\000\000"
(gdb) p dev->xdp_prog
$105 = (struct bpf_prog *) 0x0 <fixed_percpu_data>
(gdb) c
Continuing.
Thread 1 hit Breakpoint 29, dev_xdp_attach (dev=dev@entry=0xffff88818e918000, extack=extack@entry=0xffff88811c987858, link=link@entry=0x0 <fixed_percpu_data>, new_prog=new_prog@entry=0xffffc9000a516000, old_prog=old_prog@entry=0x0 <fixed_percpu_data>, flags=flags@entry=0) at net/core/dev.c:9610
9610 {
(gdb) p dev->name
$106 = "vxcan0\000\000\000\000\000\000\000\000\000"
(gdb) p dev->xdp_prog
$107 = (struct bpf_prog *) 0x0 <fixed_percpu_data>
(gdb) c
Continuing.
Thread 1 hit Breakpoint 29, dev_xdp_attach (dev=dev@entry=0xffff88818e910000, extack=extack@entry=0xffff88811c987858, link=link@entry=0x0 <fixed_percpu_data>, new_prog=new_prog@entry=0xffffc9000a516000, old_prog=old_prog@entry=0x0 <fixed_percpu_data>, flags=flags@entry=0) at net/core/dev.c:9610
9610 {
(gdb) p dev->name
$108 = "vxcan1\000\000\000\000\000\000\000\000\000"
(gdb) p dev->xdp_prog
$109 = (struct bpf_prog *) 0x0 <fixed_percpu_data>
(gdb) c
Continuing.
[Switching to Thread 1.2]
Here, the kernel attempts to attach the earlier BPF program to each of the CAN
devices present (I checked only the CAN devices since we are dealing with the
effect of XDP on the CAN networking stack). Before this point none of them had a
BPF program attached (dev->xdp_prog is NULL above), which is why XDP was not
attempted for these CAN devices earlier.
Thread 2 hit Breakpoint 26, vxcan_xmit (oskb=0xffff888115d8a400, dev=0xffff88818e918000) at drivers/net/can/vxcan.c:38
38 {
(gdb) p oskb->dev->name
$110 = "vxcan0\000\000\000\000\000\000\000\000\000"
(gdb) p oskb->dev->xdp_prog
$111 = (struct bpf_prog *) 0xffffc9000a516000
(gdb) c
Continuing.
Thread 2 hit Breakpoint 27, __netif_receive_skb (skb=skb@entry=0xffff888115d8ab00) at net/core/dev.c:5798
5798 {
(gdb) p skb->dev->name
$112 = "vxcan1\000\000\000\000\000\000\000\000\000"
(gdb) p skb->dev->xdp_prog
$113 = (struct bpf_prog *) 0xffffc9000a516000
(gdb) c
Continuing.
Thread 2 hit Breakpoint 28, do_xdp_generic (xdp_prog=0xffffc9000a516000, pskb=0xffff88843fc05af8) at net/core/dev.c:5171
5171 {
(gdb) p pskb->dev->name
$114 = "vxcan1\000\000\000\000\000\000\000\000\000"
(gdb) p pskb->dev->xdp_prog
$115 = (struct bpf_prog *) 0xffffc9000a516000
(gdb) c
Continuing.
After this, the KMSAN bug is triggered. Hence, we can conclude that, because of the
BPF program loaded earlier, the CAN device takes the generic XDP path during RX,
which is reachable even though vxcan does not support XDP itself.
It seems that the way CAN devices use the headroom to store private skb-related
data might be incompatible with the XDP path: the generic networking stack
has to expand the head at RX, and it does so in such a way that can_skb_prv(),
which uses skb->head, accesses the expanded, still uninitialized headroom.
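The effect described above can be reproduced with a small userspace model. This is only a sketch under assumptions: struct demo_skb and struct demo_priv are made-up stand-ins (not the kernel's struct sk_buff or struct can_skb_priv), demo_expand_head() only mimics the idea that the old buffer contents are copied behind the newly added headroom, and the 0xAA fill stands in for uninitialized memory. It compares a lookup relative to the buffer start with a lookup relative to the data pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Made-up stand-in for the private area stored in front of the payload. */
struct demo_priv {
	int ifindex;
	int skbcnt;
};

/* Made-up stand-in for an skb: buffer start, payload start, payload size. */
struct demo_skb {
	unsigned char *head;
	unsigned char *data;
	size_t len;
};

/* Allocate a buffer laid out as [demo_priv][payload]. */
int demo_skb_alloc(struct demo_skb *skb, size_t payload_len, int ifindex)
{
	struct demo_priv p = { .ifindex = ifindex, .skbcnt = 0 };

	skb->head = malloc(sizeof(p) + payload_len);
	if (!skb->head)
		return -1;
	memcpy(skb->head, &p, sizeof(p));
	skb->data = skb->head + sizeof(p);
	skb->len = payload_len;
	memset(skb->data, 0, payload_len);
	return 0;
}

/* Lookup relative to the buffer start. */
struct demo_priv demo_priv_via_head(const struct demo_skb *skb)
{
	struct demo_priv p;

	memcpy(&p, skb->head, sizeof(p));
	return p;
}

/* Lookup relative to the payload start. */
struct demo_priv demo_priv_via_data(const struct demo_skb *skb)
{
	struct demo_priv p;

	memcpy(&p, skb->data - sizeof(p), sizeof(p));
	return p;
}

/*
 * Grow the headroom by nhead bytes: the old contents are copied to
 * new_head + nhead, and the new bytes in front get a 0xAA poison pattern.
 */
int demo_expand_head(struct demo_skb *skb, size_t nhead)
{
	size_t old_headroom = (size_t)(skb->data - skb->head);
	size_t used = old_headroom + skb->len;
	unsigned char *new_head = malloc(nhead + used);

	if (!new_head)
		return -1;
	memset(new_head, 0xAA, nhead);
	memcpy(new_head + nhead, skb->head, used);
	free(skb->head);
	skb->head = new_head;
	skb->data = new_head + nhead + old_headroom;
	return 0;
}

/* Usage example: both lookups agree before the expansion, not after it. */
int demo_check_expand(void)
{
	struct demo_skb skb;

	if (demo_skb_alloc(&skb, 8, 20))
		return -1;
	assert(demo_priv_via_head(&skb).ifindex == 20);
	assert(demo_priv_via_data(&skb).ifindex == 20);

	if (demo_expand_head(&skb, 64)) {
		free(skb.head);
		return -1;
	}
	assert(demo_priv_via_data(&skb).ifindex == 20); /* still valid */
	assert(demo_priv_via_head(&skb).ifindex != 20); /* poison bytes */

	free(skb.head);
	return 0;
}
```

In this model the data-relative lookup keeps returning the stored values after the expansion, while the head-relative lookup reads the freshly added bytes.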
So, I think we can solve this bug in the following ways:
1. As you suggested earlier, access struct can_skb_priv using:
(struct can_skb_priv *)(skb->data - sizeof(struct can_skb_priv))
This keeps the rest of the CAN networking stack working, since it expects can_skb_priv
just before skb->data, and it also stays compatible with headroom expansion during
generic XDP.
2. Find a way to reject the XDP path for CAN devices right at the start,
for example in dev_xdp_attach():
	/* don't call drivers if the effective program didn't change */
	if (new_prog != cur_prog) {
		bpf_op = dev_xdp_bpf_op(dev, mode);
		if (!bpf_op) {
			NL_SET_ERR_MSG(extack, "Underlying driver does not support XDP in native mode");
			return -EOPNOTSUPP;
		}
		err = dev_xdp_install(dev, mode, bpf_op, extack, flags, new_prog);
		if (err)
			return err;
	}
or in some other appropriate way.
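To make the second option a bit more concrete, here is a minimal userspace sketch of what such an early rejection could look like. Everything in it is hypothetical: struct demo_net_device is a made-up stand-in for struct net_device, demo_xdp_attach_allowed() is not an existing kernel function, and DEMO_ARPHRD_CAN only mirrors the value I believe ARPHRD_CAN has (280). Where exactly such a check would belong is an open question.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_ARPHRD_ETHER 1   /* assumed to mirror ARPHRD_ETHER */
#define DEMO_ARPHRD_CAN   280 /* assumed to mirror ARPHRD_CAN */

/* Made-up stand-in for the few net_device fields the check looks at. */
struct demo_net_device {
	const char *name;
	unsigned short type;
};

/*
 * Hypothetical early check: refuse to attach a new XDP program to a CAN
 * device. Detaching (has_new_prog == 0) is always allowed in this sketch.
 */
int demo_xdp_attach_allowed(const struct demo_net_device *dev, int has_new_prog)
{
	if (has_new_prog && dev->type == DEMO_ARPHRD_CAN)
		return -EOPNOTSUPP;
	return 0;
}

/* Usage example: a CAN device is rejected, an Ethernet device is not. */
int demo_check_attach(void)
{
	struct demo_net_device can_dev = { "vxcan0", DEMO_ARPHRD_CAN };
	struct demo_net_device eth_dev = { "eth0", DEMO_ARPHRD_ETHER };

	assert(demo_xdp_attach_allowed(&can_dev, 1) == -EOPNOTSUPP);
	assert(demo_xdp_attach_allowed(&can_dev, 0) == 0);
	assert(demo_xdp_attach_allowed(&eth_dev, 1) == 0);
	return 0;
}
```

A check keyed on the device type would cover vcan and vxcan alike, but it would also change behaviour for anyone who currently attaches generic XDP programs to CAN interfaces, so it needs to be weighed against the first option.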
What do you think should be done next?
Best Regards,
Prithvi