BUG: Out of bounds read in hci_le_ext_adv_report_evt()

51 views
Skip to first unread message

马麟

unread,
Mar 21, 2021, 3:19:02 AM3/21/21
to mar...@holtmann.org, johan....@gmail.com, luiz....@gmail.com, da...@davemloft.net, ku...@kernel.org, linux-b...@vger.kernel.org, net...@vger.kernel.org, linux-...@vger.kernel.org, yajin...@zju.edu.cn, syzk...@googlegroups.com
Hi there:

Our team, zjublocksec, found the following problem during fuzzing, which seems undiscovered in previous.

==== Basic Information =========================

HEAD commit: 1e28eed17697bcf343c6743f0028cc3b5dd88bf0 (tag: v5.12-rc3, master)
Kernel config: refer to attached file (config)
C POC code: refer to attached file (poc.c)

==== KASAN Output =========================

[ 20.294394] BUG: KASAN: slab-out-of-bounds in hci_le_meta_evt+0x310b/0x3850
[ 20.300333] Read of size 2 at addr ffff888013805819 by task kworker/u5:0/53
[ 20.306227]
[ 20.307601] CPU: 0 PID: 53 Comm: kworker/u5:0 Not tainted 5.12.0-rc3+ #5
[ 20.313304] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 20.323006] Workqueue: hci0 hci_rx_work
[ 20.326303] Call Trace:
[ 20.328466] dump_stack+0xdd/0x137
[ 20.331425] ? hci_le_meta_evt+0x310b/0x3850
[ 20.335099] ? hci_le_meta_evt+0x310b/0x3850
[ 20.338773] print_address_description.constprop.0+0x18/0x130
[ 20.343697] ? hci_le_meta_evt+0x310b/0x3850
[ 20.347383] ? hci_le_meta_evt+0x310b/0x3850
[ 20.351059] kasan_report.cold+0x7f/0x111
[ 20.354512] ? hci_le_meta_evt+0x310b/0x3850
[ 20.358187] hci_le_meta_evt+0x310b/0x3850
[ 20.361722] ? run_timer_softirq+0x120/0x120
[ 20.365402] ? queue_work_on+0x69/0xa0
[ 20.368654] ? del_timer+0xb6/0x100
[ 20.371673] ? kasan_set_track+0x1c/0x30
[ 20.375062] ? le_conn_complete_evt+0x16e0/0x16e0
[ 20.379092] ? skb_release_data+0x519/0x610
[ 20.382686] ? kfree+0x91/0x270
[ 20.385413] ? kasan_set_track+0x1c/0x30
[ 20.388797] ? mutex_lock+0x89/0xd0
[ 20.391835] ? __mutex_lock_slowpath+0x10/0x10
[ 20.395651] ? hci_event_packet+0x436/0xa100
[ 20.399327] ? bt_dbg+0xe1/0x130
[ 20.402118] hci_event_packet+0x3213/0xa100
[ 20.405712] ? _raw_write_lock_irqsave+0xd0/0xd0
[ 20.409672] ? bt_dbg+0xe1/0x130
[ 20.412489] ? bt_dbg+0xe1/0x130
[ 20.415304] ? bt_err_ratelimited+0x140/0x140
[ 20.419059] ? hci_cmd_status_evt+0x46a0/0x46a0
[ 20.422955] ? bt_dbg+0xe1/0x130
[ 20.425754] ? bt_err_ratelimited+0x50/0x140
[ 20.429429] ? __wake_up_common_lock+0xde/0x130
[ 20.433333] ? __wake_up_common+0x5d0/0x5d0
[ 20.436926] ? _raw_spin_lock_irqsave+0x7b/0xd0
[ 20.440844] ? hci_chan_sent+0x23/0x800
[ 20.444167] ? __sanitizer_cov_trace_switch+0x50/0x90
[ 20.448504] ? _raw_spin_lock_irqsave+0x7b/0xd0
[ 20.452396] ? bt_dbg+0xe1/0x130
[ 20.455205] ? bt_err_ratelimited+0x140/0x140
[ 20.458961] ? _raw_spin_lock_irqsave+0x7b/0xd0
[ 20.462847] ? _raw_write_lock_irqsave+0xd0/0xd0
[ 20.466832] ? copy_fpregs_to_fpstate+0x14f/0x1d0
[ 20.470904] hci_rx_work+0x2b9/0x8e0
[ 20.473993] ? strscpy+0xa0/0x2a0
[ 20.476905] process_one_work+0x747/0xfe0
[ 20.480392] ? kthread_data+0x4f/0xc0
[ 20.483561] worker_thread+0x641/0x1190
[ 20.486883] ? rescuer_thread+0xd00/0xd00
[ 20.490332] kthread+0x344/0x410
[ 20.493127] ? kthread_create_worker_on_cpu+0xf0/0xf0
[ 20.497457] ret_from_fork+0x22/0x30
[ 20.500563]
[ 20.501919] Allocated by task 223:
[ 20.504882] kasan_save_stack+0x1b/0x40
[ 20.508212] __kasan_kmalloc+0x7a/0x90
[ 20.511459] load_elf_phdrs+0x103/0x210
[ 20.514763] load_elf_binary+0x1dc/0x4dd0
[ 20.518220] bprm_execve+0x741/0x1460
[ 20.521401] do_execveat_common+0x621/0x7c0
[ 20.525013] __x64_sys_execve+0x8f/0xc0
[ 20.528354] do_syscall_64+0x33/0x40
[ 20.531465] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 20.535791]
[ 20.537154] The buggy address belongs to the object at ffff888013805600
[ 20.537154] which belongs to the cache kmalloc-512 of size 512
[ 20.547717] The buggy address is located 25 bytes to the right of
[ 20.547717] 512-byte region [ffff888013805600, ffff888013805800)
[ 20.557963] The buggy address belongs to the page:
[ 20.562066] page:00000000ef0b1214 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888013802000 pfn:0x13800
[ 20.571028] head:00000000ef0b1214 order:3 compound_mapcount:0 compound_pincount:0
[ 20.577371] flags: 0x100000000010200(slab|head)
[ 20.581264] raw: 0100000000010200 ffff888006441450 ffffea0000473408 ffff888006443940
[ 20.587833] raw: ffff888013802000 000000000015000c 00000001ffffffff 0000000000000000
[ 20.594388] page dumped because: kasan: bad access detected
[ 20.599157]
[ 20.600503] Memory state around the buggy address:
[ 20.604598] ffff888013805700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 20.610722] ffff888013805780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 20.616835] >ffff888013805800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 20.622963] ^
[ 20.626401] ffff888013805880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 20.632507] ffff888013805900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

==== Bug Analysis =========================

In fact, this out-of-bounds read is quite similar to an old found bug (KASAN: out-of-bounds read in hci_le_direct_adv_report_evt). You can check this link to get useful information: https://groups.google.com/g/syzkaller-bugs/c/Z9-x9udEIxk/m/0NsClcU4BAAJ

Anyhow, the buggy code for this time is shown below:

static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, struct sk_buff *skb)
{
u8 num_reports = skb->data[0];
void *ptr = &skb->data[1];

hci_dev_lock(hdev);

while (num_reports--) {
struct hci_ev_le_ext_adv_report *ev = ptr;
u8 legacy_evt_type;
u16 evt_type;

evt_type = __le16_to_cpu(ev->evt_type);
legacy_evt_type = ext_evt_type_to_legacy(hdev, evt_type);
if (legacy_evt_type != LE_ADV_INVALID) {
process_adv_report(hdev, legacy_evt_type, &ev->bdaddr,
ev->bdaddr_type, NULL, 0, ev->rssi,
ev->data, ev->length,
!(evt_type & LE_EXT_ADV_LEGACY_PDU));
}

ptr += sizeof(*ev) + ev->length;
}

hci_dev_unlock(hdev);
}

As you can see, the variable `num_reports` is not being properly checked. The malformed event packet can fake a huge `num_reports` and cause `process_adv_report` to access invalid memory space. Yeah, the internal of this bug is almost equivalent to the already found bug.

==== Suggested Patch =========================

As this bug is quite similar to that found one, it's recommended to adopt a similar patch here like below (also in the attached file: patch.diff).

--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -5685,10 +5685,14 @@ static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, struct sk_buff *skb)
{
u8 num_reports = skb->data[0];
void *ptr = &skb->data[1];
+ u32 len_processed = 0;

hci_dev_lock(hdev);

while (num_reports--) {
+ if (len_processed > skb->len)
+ break;
+
struct hci_ev_le_ext_adv_report *ev = ptr;
u8 legacy_evt_type;
u16 evt_type;
@@ -5703,6 +5707,7 @@ static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, struct sk_buff *skb)
}

ptr += sizeof(*ev) + ev->length;
+ len_processed += sizeof(*ev) + ev->length;
}

hci_dev_unlock(hdev);

The idea here is just to prevent the `ptr` to go over bound of the `skb->len`. After testing, the reproducer code will not work out against this fix. :)

==== Others =========================

Please let me know if there is any confuses.
Best wishes!

poc.c
config
patch.diff

Luiz Augusto von Dentz

unread,
Mar 21, 2021, 5:06:25 PM3/21/21
to 马麟, Marcel Holtmann, Johan Hedberg, David Miller, Jakub Kicinski, linux-b...@vger.kernel.org, open list:NETWORKING [GENERAL], Linux Kernel Mailing List, yajin...@zju.edu.cn, syzk...@googlegroups.com
Hi,
Or we do something like
https://lore.kernel.org/linux-bluetooth/20201024002251.13...@gmail.com/,
that said the reason we didn't applied my patches was that the
controller would be the one generating invalid data, but it seems you
are reproducing with vhci controller which is only used for emulating
a controller and requires root privileges so it is unlikely these
conditions would happens with hardware itself, in the other hand as
there seems to be more and more reports using vhci to emulate broken
events it perhaps more productive to introduce proper checks for all
events so we don't have to deal with more reports like this in the
future.

> ==== Others =========================
>
> Please let me know if there is any confuses.
> Best wishes!



--
Luiz Augusto von Dentz

Emil Lenngren

unread,
Mar 21, 2021, 7:23:15 PM3/21/21
to Luiz Augusto von Dentz, 马麟, Marcel Holtmann, Johan Hedberg, David Miller, Jakub Kicinski, linux-b...@vger.kernel.org, open list:NETWORKING [GENERAL], Linux Kernel Mailing List, yajin...@zju.edu.cn, syzk...@googlegroups.com
Hi,

Den mån 22 mars 2021 kl 00:01 skrev Luiz Augusto von Dentz
<luiz....@gmail.com>:
> Or we do something like
> https://lore.kernel.org/linux-bluetooth/20201024002251.13...@gmail.com/,
> that said the reason we didn't applied my patches was that the
> controller would be the one generating invalid data, but it seems you
> are reproducing with vhci controller which is only used for emulating
> a controller and requires root privileges so it is unlikely these
> conditions would happens with hardware itself, in the other hand as
> there seems to be more and more reports using vhci to emulate broken
> events it perhaps more productive to introduce proper checks for all
> events so we don't have to deal with more reports like this in the
> future.

Keep in mind that when using the H4 uart protocol without any error
correction (as H5 has), it is possible that random bit errors occur on
the wire. I wouldn't like my kernel to crash due to this. Bit errors
happen all the time on RPi 4 for example at the default baud rate if
you just do some heavy stress testing, or use an application that
transfers a lot of data over Bluetooth.

/Emil

Luiz Augusto von Dentz

unread,
Mar 22, 2021, 2:04:02 AM3/22/21
to Emil Lenngren, 马麟, Marcel Holtmann, Johan Hedberg, David Miller, Jakub Kicinski, linux-b...@vger.kernel.org, open list:NETWORKING [GENERAL], Linux Kernel Mailing List, yajin...@zju.edu.cn, syzk...@googlegroups.com
Hi Emil,
While we can catch some errors like that, and possible avoid crashes,
this should be limited to just boundary checks and not actually error
correction, that I'm afraid is out of our hands since we can still
receive an event that does match the original packet size but meant
something else which may break the synchronization of the states
between the controller and the host, also perhaps we need to notify
this type of error since even if we start discarding the events that
can possible cause states to be out of sync and the controller will
need to be reset in order to recover.
Reply all
Reply to author
Forward
0 new messages