Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH] dma-debug: enhance dma_debug_device_change() to check for mapping errors

0 views
Skip to first unread message

Shuah Khan

unread,
Nov 12, 2013, 4:00:02 PM11/12/13
to
dma-debug checks to verify if driver validated the address returned by
dma mapping routines when driver does unmap. If a driver doesn't call
unmap, failure to check mapping errors isn't detected and reported.

Enhancing existing bus notifier_call dma_debug_device_change() to check
for mapping errors at the same time it detects leaked dma buffers for
BUS_NOTIFY_UNBOUND_DRIVER event.

Signed-off-by: Shuah Khan <shua...@samsung.com>
---
lib/dma-debug.c | 23 +++++++++++++++--------
1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index d87a17a..6c17b90 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -712,6 +712,19 @@ out_err:
return -ENOMEM;
}

+static void check_dma_mapping_error(struct dma_debug_entry *ref,
+ struct dma_debug_entry *entry)
+{
+ if (entry->map_err_type == MAP_ERR_NOT_CHECKED) {
+ err_printk(ref->dev, entry,
+ "DMA-API: device driver failed to check map error"
+ "[device address=0x%016llx] [size=%llu bytes] "
+ "[mapped as %s]",
+ ref->dev_addr, ref->size,
+ type2name[entry->type]);
+ }
+}
+
static int device_dma_allocations(struct device *dev, struct dma_debug_entry **out_entry)
{
struct dma_debug_entry *entry;
@@ -724,6 +737,7 @@ static int device_dma_allocations(struct device *dev, struct dma_debug_entry **o
spin_lock(&dma_entry_hash[i].lock);
list_for_each_entry(entry, &dma_entry_hash[i].list, list) {
if (entry->dev == dev) {
+ check_dma_mapping_error(entry, entry);
count += 1;
*out_entry = entry;
}
@@ -928,14 +942,7 @@ static void check_unmap(struct dma_debug_entry *ref)
dir2name[ref->direction]);
}

- if (entry->map_err_type == MAP_ERR_NOT_CHECKED) {
- err_printk(ref->dev, entry,
- "DMA-API: device driver failed to check map error"
- "[device address=0x%016llx] [size=%llu bytes] "
- "[mapped as %s]",
- ref->dev_addr, ref->size,
- type2name[entry->type]);
- }
+ check_dma_mapping_error(ref, entry);

hash_bucket_del(entry);
dma_entry_free(entry);
--
1.8.3.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Alexander Duyck

unread,
Nov 12, 2013, 5:20:02 PM11/12/13
to
I think this might be overdoing the error checking by a bit. I would
much rather have the DMA leaked error be visible than have it buried
under messages about the failure to check for DMA errors. In my mind
the DMA buffer leak is much more serious than the failure to check for
mapping errors.

Thanks,

Alex

Shuah Khan

unread,
Nov 12, 2013, 5:30:02 PM11/12/13
to
On 11/12/2013 03:12 PM, Alexander Duyck wrote:
> I think this might be overdoing the error checking by a bit. I would
> much rather have the DMA leaked error be visible than have it buried
> under messages about the failure to check for DMA errors. In my mind
> the DMA buffer leak is much more serious than the failure to check for
> mapping errors.
>

I agree DMA leak message is more important. Would you like to see just
one message if mapping errors are found? Instead of printing the entire
stacktrace like it does now, it could just do warn.

-- Shuah


--
Shuah Khan
Senior Linux Kernel Developer - Open Source Group
Samsung Research America(Silicon Valley)
shua...@samsung.com | (970) 672-0658

Alexander Duyck

unread,
Nov 12, 2013, 7:30:02 PM11/12/13
to
On 11/12/2013 02:20 PM, Shuah Khan wrote:
> On 11/12/2013 03:12 PM, Alexander Duyck wrote:
>> I think this might be overdoing the error checking by a bit. I would
>> much rather have the DMA leaked error be visible than have it buried
>> under messages about the failure to check for DMA errors. In my mind
>> the DMA buffer leak is much more serious than the failure to check for
>> mapping errors.
>>
>
> I agree DMA leak message is more important. Would you like to see just
> one message if mapping errors are found? Instead of printing the
> entire stacktrace like it does now, it could just do warn.
>
> -- Shuah
>
>

I think only one message would be preferred. If nothing else you might
update the function so that it only dumps the DMA mapping error check on
the first one found and then goes quiet after that. There is no point
in displaying the same error message multiple times since that is likely
what we would see.

Thanks,

Alex

Shuah Khan

unread,
Nov 13, 2013, 4:00:02 PM11/13/13
to
dma-debug checks to verify if driver validated the address returned by
dma mapping routines when driver does unmap. If a driver doesn't call
unmap, failure to check mapping errors isn't detected and reported.

Enhancing existing bus notifier_call dma_debug_device_change() to check
for mapping errors at the same time it detects leaked dma buffers for
BUS_NOTIFY_UNBOUND_DRIVER event. It scans for mapping errors and if any
found, prints one warning message that includes mapping error count.

Signed-off-by: Shuah Khan <shua...@samsung.com>
---
lib/dma-debug.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index d87a17a..e34865e 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -717,6 +717,7 @@ static int device_dma_allocations(struct device *dev, struct dma_debug_entry **o
struct dma_debug_entry *entry;
unsigned long flags;
int count = 0, i;
+ int map_err_cnt = 0;

local_irq_save(flags);

@@ -724,6 +725,8 @@ static int device_dma_allocations(struct device *dev, struct dma_debug_entry **o
spin_lock(&dma_entry_hash[i].lock);
list_for_each_entry(entry, &dma_entry_hash[i].list, list) {
if (entry->dev == dev) {
+ if (entry->map_err_type == MAP_ERR_NOT_CHECKED)
+ map_err_cnt += 1;
count += 1;
*out_entry = entry;
}
@@ -733,6 +736,10 @@ static int device_dma_allocations(struct device *dev, struct dma_debug_entry **o

local_irq_restore(flags);

+ if (map_err_cnt)
+ dev_warn(entry->dev,
+ "DMA-API: device driver failed to check map errors: "
+ "[count] = %d\n", map_err_cnt);
return count;
}

--
1.8.3.2

Joerg Roedel

unread,
Dec 30, 2013, 9:20:02 AM12/30/13
to
I think it is better to check for this in a seperate function and use
err_printk instead of dev_warn in the end to print the errors.
The new function can then be called in the dma_debug_device_change
callback like device_dma_allocations is.


Joerg

Shuah Khan

unread,
Jan 3, 2014, 1:30:03 PM1/3/14
to
I did explore separate function option and backed off from it since the
new routine will have to duplicate what device_dma_allocations() does
except that it checks for entry->map_err_type.

I still have the patch that does that saved away. If you still prefer
that approach, I can rework the patch and send it.

-- Shuah


--
Shuah Khan
Senior Linux Kernel Developer - Open Source Group
Samsung Research America(Silicon Valley)
shua...@samsung.com | (970) 672-0658

Joerg Roedel

unread,
Jan 7, 2014, 9:30:02 AM1/7/14
to
On Fri, Jan 03, 2014 at 11:26:04AM -0700, Shuah Khan wrote:
> On 12/30/2013 07:15 AM, Joerg Roedel wrote:
> >I think it is better to check for this in a seperate function and use
> >err_printk instead of dev_warn in the end to print the errors.
> >The new function can then be called in the dma_debug_device_change
> >callback like device_dma_allocations is.
> >
>
> I did explore separate function option and backed off from it since
> the new routine will have to duplicate what device_dma_allocations()
> does except that it checks for entry->map_err_type.
>
> I still have the patch that does that saved away. If you still
> prefer that approach, I can rework the patch and send it.

You can get rid of the code duplication by defining for_each macros for
the list traversals. So I would still prefer to have this check in a
seperate function.


Joerg

Shuah Khan

unread,
Jan 7, 2014, 10:10:02 AM1/7/14
to
On 01/07/2014 07:26 AM, Joerg Roedel wrote:
> On Fri, Jan 03, 2014 at 11:26:04AM -0700, Shuah Khan wrote:
>> On 12/30/2013 07:15 AM, Joerg Roedel wrote:
>>> I think it is better to check for this in a seperate function and use
>>> err_printk instead of dev_warn in the end to print the errors.
>>> The new function can then be called in the dma_debug_device_change
>>> callback like device_dma_allocations is.
>>>
>>
>> I did explore separate function option and backed off from it since
>> the new routine will have to duplicate what device_dma_allocations()
>> does except that it checks for entry->map_err_type.
>>
>> I still have the patch that does that saved away. If you still
>> prefer that approach, I can rework the patch and send it.
>
> You can get rid of the code duplication by defining for_each macros for
> the list traversals. So I would still prefer to have this check in a
> seperate function.
>
>
> Joerg
>
>

Joerg,

This patch and a follow-on cocinelli warning fix patch are in
linux-next. Would you like me to send a patch relative to the change in
linux-next or cut a new patch against the latest Linus's git. I can go
either way. We just have to remember to drop those two patches from
linux-next.

-- Shuah

--
Shuah Khan
Senior Linux Kernel Developer - Open Source Group
Samsung Research America(Silicon Valley)
shua...@samsung.com | (970) 672-0658

Joerg Roedel

unread,
Jan 7, 2014, 10:20:02 AM1/7/14
to
On Tue, Jan 07, 2014 at 08:00:33AM -0700, Shuah Khan wrote:
> This patch and a follow-on cocinelli warning fix patch are in
> linux-next. Would you like me to send a patch relative to the change
> in linux-next or cut a new patch against the latest Linus's git. I
> can go either way. We just have to remember to drop those two
> patches from linux-next.

Please do a patch against linus.git and drop the two patches from
linux-next. Over which tree did they go into linux-next anyway?


Joerg

Shuah Khan

unread,
Jan 7, 2014, 12:30:02 PM1/7/14
to
On 01/07/2014 08:12 AM, Joerg Roedel wrote:
> On Tue, Jan 07, 2014 at 08:00:33AM -0700, Shuah Khan wrote:
>> This patch and a follow-on cocinelli warning fix patch are in
>> linux-next. Would you like me to send a patch relative to the change
>> in linux-next or cut a new patch against the latest Linus's git. I
>> can go either way. We just have to remember to drop those two
>> patches from linux-next.
>
> Please do a patch against linus.git and drop the two patches from
> linux-next. Over which tree did they go into linux-next anyway?
>
>

ok. It went through mm tree.

-- Shuah


--
Shuah Khan
Senior Linux Kernel Developer - Open Source Group
Samsung Research America(Silicon Valley)
shua...@samsung.com | (970) 672-0658

Shuah Khan

unread,
Jan 10, 2014, 7:30:02 PM1/10/14
to
On 01/07/2014 10:22 AM, Shuah Khan wrote:
> On 01/07/2014 08:12 AM, Joerg Roedel wrote:
>> On Tue, Jan 07, 2014 at 08:00:33AM -0700, Shuah Khan wrote:
>>> This patch and a follow-on cocinelli warning fix patch are in
>>> linux-next. Would you like me to send a patch relative to the change
>>> in linux-next or cut a new patch against the latest Linus's git. I
>>> can go either way. We just have to remember to drop those two
>>> patches from linux-next.
>>
>> Please do a patch against linus.git and drop the two patches from
>> linux-next. Over which tree did they go into linux-next anyway?
>>
>>
>
> ok. It went through mm tree.
>

I have a separate routine now for scanning for entries with map error
flag set and I also have a new err_printk() call to print the entry from
dma_debug_device_change() right after device_dma_allocations() check and
corresponding err_printk(). Snippet of the diff below. Only one warning
gets printed. Looking at err_printk(): if (!show_all_errors &&
show_num_errors > 0) conditional, it appears that is the way it is
supposed to work, unless I am missing something

One way to not miss the second warning is to just generate just a
pr_err() for the mapping error instead of err_printk(). Leaked entries
is a bigger problem that should really be shown and the same entry might
not have the map error flag set. Thoughts?

static int dma_debug_device_change(struct notifier_block *nb, unsigned
long action, void *data)
{
struct device *dev = data;
@@ -758,6 +784,18 @@ static int dma_debug_device_change(struct
notifier_block *nb, unsigned long acti
"[mapped with %s] [mapped as %s]\n",
count, entry->dev_addr, entry->size,
dir2name[entry->direction],
type2name[entry->type]);
+
+ count = device_dma_map_errors(dev, &entry);
+ if (count == 0)
+ break;
+ err_printk(dev, entry,
+ "DMA-API: device driver failed to check
map err"
+ ": [count=%d]\n"
+ "Details of an entry with map error flag
set: "
+ "[device address=0x%016llx] [size=%llu
bytes] "
+ "[mapped with %s] [mapped as %s]\n",
+ count, entry->dev_addr, entry->size,
+ dir2name[entry->direction], type2name[entry->type]);
break;
0 new messages