
[PATCH] iommu/arm-smmu: Return IOVA in iova_to_phys when SMMU is bypassed


sunil....@gmail.com

Apr 17, 2017, 8:00:05 AM
From: Sunil Goutham <sgou...@cavium.com>

For software initiated address translation, when domain type is
IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
i.e return the same IOVA as translated address.

This patch is an extension to Will Deacon's patchset
"Implement SMMU passthrough using the default domain".

Signed-off-by: Sunil Goutham <sgou...@cavium.com>
---
drivers/iommu/arm-smmu.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 41afb07..2f4a130 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1405,6 +1405,9 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;

+	if (domain->type == IOMMU_DOMAIN_IDENTITY)
+		return iova;
+
 	if (!ops)
 		return 0;

--
2.7.4

Sunil Kovvuri

Apr 20, 2017, 12:30:05 AM
Any comments, or has this patch been accepted?

Thanks,
Sunil.

Will Deacon

Apr 24, 2017, 10:50:09 AM
On Mon, Apr 17, 2017 at 05:27:26PM +0530, sunil....@gmail.com wrote:
> From: Sunil Goutham <sgou...@cavium.com>
>
> For software initiated address translation, when domain type is
> IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
> i.e return the same IOVA as translated address.
>
> This patch is an extension to Will Deacon's patchset
> "Implement SMMU passthrough using the default domain".

Are you actually seeing an issue here? If so, why isn't SMMUv3 affected too?

> Signed-off-by: Sunil Goutham <sgou...@cavium.com>
> ---
> drivers/iommu/arm-smmu.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 41afb07..2f4a130 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1405,6 +1405,9 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
>
> + if (domain->type == IOMMU_DOMAIN_IDENTITY)
> + return iova;
> +
> if (!ops)
> return 0;

I'd have thought ops would be NULL, since arm_smmu_init_domain_context
doesn't allocate them for an identity domain.

I don't understand this patch. Please can you explain the problem more
clearly?

Will

Sunil Kovvuri

Apr 24, 2017, 12:00:05 PM
On Mon, Apr 24, 2017 at 8:14 PM, Will Deacon <will....@arm.com> wrote:
> On Mon, Apr 17, 2017 at 05:27:26PM +0530, sunil....@gmail.com wrote:
>> From: Sunil Goutham <sgou...@cavium.com>
>>
>> For software initiated address translation, when domain type is
>> IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
>> i.e return the same IOVA as translated address.
>>
>> This patch is an extension to Will Deacon's patchset
>> "Implement SMMU passthrough using the default domain".
>
> Are you actually seeing an issue here? If so, why isn't SMMUv3 affected too?
Yes, and SMMUv3 should also be affected, but as of now I don't see any use case.
If needed, I can re-submit the patch with changes in SMMUv3 as well.

>
>> Signed-off-by: Sunil Goutham <sgou...@cavium.com>
>> ---
>> drivers/iommu/arm-smmu.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 41afb07..2f4a130 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -1405,6 +1405,9 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
>> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>> struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
>>
>> + if (domain->type == IOMMU_DOMAIN_IDENTITY)
>> + return iova;
>> +
>> if (!ops)
>> return 0;
>
> I'd have thought ops would be NULL, since arm_smmu_init_domain_context
> doesn't allocate them for an identity domain.
Yes ops is set to NULL.

>
> I don't understand this patch. Please can you explain the problem more
> clearly?
AFAIK, for any driver outside the IOMMU subsystem there is only one way to
tell whether a device is attached to an IOMMU, and that is by checking
iommu_domain. And I don't think it would be appropriate for the driver to
check domain->type before calling the 'iommu_iova_to_phys()' API.

The difference between the IOMMU being disabled and the IOMMU being in
passthrough mode is that in the latter case the device is still attached to
the default domain, whereas in the former case the domain is NULL. So there
is no way for the external driver to tell whether the IOMMU is in
passthrough mode or DMA mode.

And since ops is NULL in passthrough mode, 'iommu_iova_to_phys()' will
return zero.
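
A minimal userspace sketch of the behaviour described above; it is not kernel code. The types and names (`struct domain`, `iova_to_phys_old`, `iova_to_phys_new`, `fake_walk`) are hypothetical stand-ins for the kernel's `struct iommu_domain` and the SMMU driver's `arm_smmu_iova_to_phys()`, and the fixed offset used by `fake_walk` is invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
typedef uint64_t dma_addr_t;
typedef uint64_t phys_addr_t;

enum domain_type { DOMAIN_IDENTITY, DOMAIN_DMA };

struct pgtable_ops {
	phys_addr_t (*iova_to_phys)(dma_addr_t iova);
};

struct domain {
	enum domain_type type;
	const struct pgtable_ops *ops;	/* NULL for an identity domain */
};

/* Model of the pre-patch behaviour: a NULL ops pointer yields 0. */
static phys_addr_t iova_to_phys_old(const struct domain *d, dma_addr_t iova)
{
	assert(d != NULL);
	if (!d->ops)
		return 0;
	return d->ops->iova_to_phys(iova);
}

/* Model of the patched behaviour: an identity domain returns the IOVA. */
static phys_addr_t iova_to_phys_new(const struct domain *d, dma_addr_t iova)
{
	assert(d != NULL);
	if (d->type == DOMAIN_IDENTITY)
		return iova;
	if (!d->ops)
		return 0;
	return d->ops->iova_to_phys(iova);
}

/* Hypothetical "page-table walk": adds a made-up fixed offset. */
static phys_addr_t fake_walk(dma_addr_t iova)
{
	return iova + 0x80000000ULL;
}

static const struct pgtable_ops fake_ops = { .iova_to_phys = fake_walk };
```

With an identity domain whose `ops` is NULL, `iova_to_phys_old()` returns 0 for any input while `iova_to_phys_new()` returns the input unchanged, which is the difference an external caller of `iommu_iova_to_phys()` would observe.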

Use case for your reference
https://lkml.org/lkml/2017/3/7/299
This driver is for a NIC interface on a platform which supports SMMUv2.

Let me know if any more details are needed.

Thanks,
Sunil.

>
> Will

Will Deacon

Apr 24, 2017, 12:10:07 PM
On Mon, Apr 24, 2017 at 09:23:16PM +0530, Sunil Kovvuri wrote:
> On Mon, Apr 24, 2017 at 8:14 PM, Will Deacon <will....@arm.com> wrote:
> > On Mon, Apr 17, 2017 at 05:27:26PM +0530, sunil....@gmail.com wrote:
> >> From: Sunil Goutham <sgou...@cavium.com>
> >>
> >> For software initiated address translation, when domain type is
> >> IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
> >> i.e return the same IOVA as translated address.
> >>
> >> This patch is an extension to Will Deacon's patchset
> >> "Implement SMMU passthrough using the default domain".
> >
> > Are you actually seeing an issue here? If so, why isn't SMMUv3 affected too?
> Yes, and SMMUv3 should also be affected, but as of now I don't see any use case.
> If needed, I can re-submit the patch with changes in SMMUv3 as well.

Yes, please.

> >> Signed-off-by: Sunil Goutham <sgou...@cavium.com>
> >> ---
> >> drivers/iommu/arm-smmu.c | 3 +++
> >> 1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >> index 41afb07..2f4a130 100644
> >> --- a/drivers/iommu/arm-smmu.c
> >> +++ b/drivers/iommu/arm-smmu.c
> >> @@ -1405,6 +1405,9 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
> >> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> >> struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
> >>
> >> + if (domain->type == IOMMU_DOMAIN_IDENTITY)
> >> + return iova;
> >> +
> >> if (!ops)
> >> return 0;
> >
> > I'd have thought ops would be NULL, since arm_smmu_init_domain_context
> > doesn't allocate them for an identity domain.
> Yes ops is set to NULL.

Argh, sorry, I completely overlooked that we return 0 in that case, rather
than the iova.

> > I don't understand this patch. Please can you explain the problem more
> > clearly?
> AFAIK, for any driver outside the IOMMU subsystem there is only one way to
> tell whether a device is attached to an IOMMU, and that is by checking
> iommu_domain. And I don't think it would be appropriate for the driver to
> check domain->type before calling the 'iommu_iova_to_phys()' API.
>
> The difference between the IOMMU being disabled and the IOMMU being in
> passthrough mode is that in the latter case the device is still attached to
> the default domain, whereas in the former case the domain is NULL. So there
> is no way for the external driver to tell whether the IOMMU is in
> passthrough mode or DMA mode.
>
> And since ops is NULL in passthrough mode, 'iommu_iova_to_phys()' will
> return zero.
>
> Use case for your reference
> https://lkml.org/lkml/2017/3/7/299
> This driver is for a NIC interface on a platform which supports SMMUv2.

Blimey, that driver is horrible, but I take your point on the API. Please
repost, fixing SMMUv3 at the same time.

Will

Sunil Kovvuri

Apr 24, 2017, 12:30:06 PM
Sure, will re-submit the patch with SMMUv3 changes.

On a separate note, if you have time, I would definitely like to know
your feedback and what's horrible in that driver, probably in a different
email to keep that out of scope of this patch.

Thanks,
Sunil.

sunil....@gmail.com

Apr 25, 2017, 6:10:05 AM
From: Sunil Goutham <sgou...@cavium.com>

For software initiated address translation, when domain type is
IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
i.e return the same IOVA as translated address.

This patch is an extension to Will Deacon's patchset
"Implement SMMU passthrough using the default domain".

Signed-off-by: Sunil Goutham <sgou...@cavium.com>
---

V2
- As per Will's suggestion applied fix to SMMUv3 driver as well.

drivers/iommu/arm-smmu-v3.c | 3 +++
drivers/iommu/arm-smmu.c | 3 +++
2 files changed, 6 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 05b4592..d412bdd 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1714,6 +1714,9 @@ arm_smmu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;

+	if (domain->type == IOMMU_DOMAIN_IDENTITY)
+		return iova;
+
 	if (!ops)
 		return 0;

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index bfab4f7..81088cd 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1459,6 +1459,9 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;

+	if (domain->type == IOMMU_DOMAIN_IDENTITY)
+		return iova;
+
 	if (!ops)
 		return 0;

--
2.7.4

Sunil Kovvuri

Apr 26, 2017, 5:30:06 AM
Will,

If you are okay with the patch, can you please ACK?

Thanks,
Sunil.

Will Deacon

Apr 26, 2017, 6:10:06 AM
Hi Sunil,

On Tue, Apr 25, 2017 at 03:27:52PM +0530, sunil....@gmail.com wrote:
> From: Sunil Goutham <sgou...@cavium.com>
>
> For software initiated address translation, when domain type is
> IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
> i.e return the same IOVA as translated address.
>
> This patch is an extension to Will Deacon's patchset
> "Implement SMMU passthrough using the default domain".
>
> Signed-off-by: Sunil Goutham <sgou...@cavium.com>
> ---
>
> V2
> - As per Will's suggestion applied fix to SMMUv3 driver as well.

This follows what the AMD driver does, so:

Acked-by: Will Deacon <will....@arm.com>

but I still think that having drivers/net/ethernet/cavium/thunder/nicvf_queues.c
poke around with the physical address to get at the struct pages underlying
a DMA buffer is really dodgy. Is there no way this can be avoided, perhaps
by tracking the pages some other way (although I don't understand why you're
having to mess with the page reference counts to start with)?

At least, I think you should be checking the domain type in
nicvf_iova_to_phys, which clearly expects a DMA domain if one exists at all.
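
A minimal userspace sketch of the suggestion above, assuming a driver-side helper that translates only when a DMA-type domain is present. The thread does not show the body of `nicvf_iova_to_phys`, so `driver_iova_to_phys`, `struct domain`, and `model_translate` are hypothetical names, and the `offset` field is an invented stand-in for a real page-table walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
typedef uint64_t dma_addr_t;
typedef uint64_t phys_addr_t;

enum domain_type { DOMAIN_IDENTITY, DOMAIN_DMA };

struct domain {
	enum domain_type type;
	phys_addr_t offset;	/* invented stand-in for a page-table walk */
};

/* Hypothetical stand-in for the IOMMU translation of a DMA domain. */
static phys_addr_t model_translate(const struct domain *d, dma_addr_t iova)
{
	assert(d != NULL && d->type == DOMAIN_DMA);
	return iova + d->offset;
}

/*
 * Driver-side helper with an explicit domain-type check:
 *  - no domain at all  -> IOMMU absent, address used as-is
 *  - identity domain   -> IOMMU bypassed, address used as-is
 *  - DMA domain        -> ask the (modelled) IOMMU to translate
 */
static phys_addr_t driver_iova_to_phys(const struct domain *d, dma_addr_t addr)
{
	if (!d)
		return addr;
	if (d->type != DOMAIN_DMA)
		return addr;
	return model_translate(d, addr);
}
```

The trade-off debated in the thread is where this check belongs: in the caller, as sketched here, or inside the IOMMU driver's `iova_to_phys` callback, as the patch does.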

Joerg: sorry, this is another one for you to pick up if possible.

Cheers,

Will

Joerg Roedel

Apr 26, 2017, 6:40:06 AM
On Wed, Apr 26, 2017 at 11:01:50AM +0100, Will Deacon wrote:
> Joerg: sorry, this is another one for you to pick up if possible.

Applied.

Sunil Kovvuri

Apr 26, 2017, 6:50:08 AM
On Wed, Apr 26, 2017 at 3:31 PM, Will Deacon <will....@arm.com> wrote:
> Hi Sunil,
>
> On Tue, Apr 25, 2017 at 03:27:52PM +0530, sunil....@gmail.com wrote:
>> From: Sunil Goutham <sgou...@cavium.com>
>>
>> For software initiated address translation, when domain type is
>> IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
>> i.e return the same IOVA as translated address.
>>
>> This patch is an extension to Will Deacon's patchset
>> "Implement SMMU passthrough using the default domain".
>>
>> Signed-off-by: Sunil Goutham <sgou...@cavium.com>
>> ---
>>
>> V2
>> - As per Will's suggestion applied fix to SMMUv3 driver as well.
>
> This follows what the AMD driver does, so:
>
> Acked-by: Will Deacon <will....@arm.com>

Thanks,

>
> but I still think that having drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> poke around with the physical address to get at the struct pages underlying
> a DMA buffer is really dodgy.

To be precise, the driver is not dealing with page structures. Just like
for any other NIC device, the driver needs to know the virtual address
where the packet was DMA'ed, so that an SKB is framed and
handed over to the network stack. For the reasons mentioned below,
in this driver it's not possible to maintain a list of DMA address to
virtual address mappings. Hence, using the IOMMU API, the DMA address
is translated to a physical address and finally to a virtual address. I don't
see anything dodgy here.

> Is there no way this can be avoided, perhaps by tracking the pages some other way

I have explained that in the commit message
--
Also VNIC doesn't have a separate receive buffer ring per receive
queue, so there is no 1:1 descriptor index matching between CQE_RX
and the index in buffer ring from where a buffer has been used for
DMA'ing. Unlike other NICs, here it's not possible to maintain dma
address to virt address mappings within the driver. This leaves us
no other choice but to use IOMMU's IOVA address conversion API to
get buffer's virtual address which can be given to network stack
for processing.
--

>(although I don't understand why you're having to mess with the page reference
>counts to start with)?
Not sure why you say it's a mess, adjusting page reference counts is quite
common if you check other NIC drivers. On ARM64 especially when using
64KB pages, if we have only one packet buffer for each page then we
will have to set aside a whole lot of memory which sometimes is not possible
on embedded platforms. Hence multiple pkt buffers per page, and page reference
is set accordingly.
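
A minimal userspace sketch of the buffer-sharing arithmetic described above. It is a simplified model: `struct model_page`, `bufs_per_page`, `carve_page`, and `put_buffer` are hypothetical names, the reference count is a plain integer rather than the kernel's page refcount, and the thread does not state the driver's actual buffer size.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE (64u * 1024u)	/* 64KB page, as in the text */

/* Simplified stand-in for a page with a reference count. */
struct model_page {
	unsigned int refcount;
};

/* Number of fixed-size receive buffers that fit in one page. */
static unsigned int bufs_per_page(unsigned int buf_size)
{
	assert(buf_size > 0 && buf_size <= MODEL_PAGE_SIZE);
	return MODEL_PAGE_SIZE / buf_size;
}

/* Carve a page into buffers, taking one reference per buffer. */
static unsigned int carve_page(struct model_page *pg, unsigned int buf_size)
{
	unsigned int n = bufs_per_page(buf_size);

	pg->refcount = n;
	return n;
}

/* Drop one buffer's reference; returns 1 once the page can be freed. */
static int put_buffer(struct model_page *pg)
{
	assert(pg->refcount > 0);
	pg->refcount--;
	return pg->refcount == 0;
}
```

For example, assuming 2048-byte buffers, one 64KB page holds 32 buffers, and the page only becomes freeable after all 32 references have been dropped.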

>
> At least, I think you should be checking the domain type in
> nicvf_iova_to_phys, which clearly expects a DMA domain if one exists at all.

Probably, but I don't think the network maintainers would be okay with it, since
such details should be hidden from a network driver's point of view. Conversely,
one could argue that the NIC driver shouldn't even have to check whether a domain
is set or not.

Thanks,
Sunil.

Will Deacon

Apr 26, 2017, 7:40:05 AM
It's dodgy because you're the only NIC driver using iommu_iova_to_phys
directly and, afaict, the driver could just stash either the struct page
or the virtual address at the point of allocation.

> > Is there no way this can be avoided, perhaps by tracking the pages some other way
>
> I have explained that in the commit message
> --
> Also VNIC doesn't have a separate receive buffer ring per receive
> queue, so there is no 1:1 descriptor index matching between CQE_RX
> and the index in buffer ring from where a buffer has been used for
> DMA'ing. Unlike other NICs, here it's not possible to maintain dma
> address to virt address mappings within the driver. This leaves us
> no other choice but to use IOMMU's IOVA address conversion API to
> get buffer's virtual address which can be given to network stack
> for processing.
> --
>
> >(although I don't understand why you're having to mess with the page reference
> >counts to start with)?
> Not sure why you say it's a mess, adjusting page reference counts is quite
> common if you check other NIC drivers. On ARM64 especially when using
> 64KB pages, if we have only one packet buffer for each page then we
> will have to set aside a whole lot of memory which sometimes is not possible
> on embedded platforms. Hence multiple pkt buffers per page, and page reference
> is set accordingly.

I wasn't saying that was a mess, I was just saying that I didn't understand
why you mess (verb) with the page reference counts (my ignorance of the
network layer). The code that I think is a mess is:

phys_addr = nicvf_iova_to_phys(nic, buf_addr);
[...]
put_page(virt_to_page(phys_to_virt(phys_addr)));

because:

(a) You have the information you need at allocation time, but you've
failed to record that and are trying to use the IOMMU API to
reconstruct the CPU virtual address

(b) When there isn't an IOMMU present, you assume that bus addresses ==
physical addresses

(c) You assume that the DMA buffer is mapped in the linear mapping

that's probably all true for ThunderX/arm64, but it's generally not portable
or reliable code. If you could get a handle to the struct page that you
allocated in the first place, then you could use page_address to get its
virtual address instead of having to go via the physical address.
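
A minimal userspace sketch of point (a), recording the CPU address at allocation time and looking it up again by the address the hardware reports. Neither poster proposes this exact structure; `struct rbuf_table`, `rbuf_record_alloc`, and `rbuf_claim` are hypothetical names, and the linear scan is chosen only for brevity. Whether such bookkeeping is practical on this hardware is precisely what the thread disputes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BUFS 64	/* arbitrary capacity for the sketch */

/* One outstanding receive buffer. */
struct rbuf_record {
	uint64_t dma_addr;	/* address handed to the device */
	void *cpu_addr;		/* CPU virtual address noted at allocation */
};

struct rbuf_table {
	struct rbuf_record rec[MAX_BUFS];
	size_t count;
};

/* Record a buffer at allocation time; returns 0, or -1 if the table is full. */
static int rbuf_record_alloc(struct rbuf_table *t, uint64_t dma, void *cpu)
{
	assert(t != NULL && cpu != NULL);
	if (t->count == MAX_BUFS)
		return -1;
	t->rec[t->count].dma_addr = dma;
	t->rec[t->count].cpu_addr = cpu;
	t->count++;
	return 0;
}

/*
 * Look up the CPU address for a DMA address reported on completion and
 * remove the record. Returns NULL if the address was never recorded.
 */
static void *rbuf_claim(struct rbuf_table *t, uint64_t dma)
{
	size_t i;

	for (i = 0; i < t->count; i++) {
		if (t->rec[i].dma_addr == dma) {
			void *cpu = t->rec[i].cpu_addr;

			/* Fill the hole with the last record. */
			t->rec[i] = t->rec[--t->count];
			return cpu;
		}
	}
	return NULL;
}
```

Keying the lookup by address rather than by ring index avoids needing a 1:1 descriptor match, at the cost of a search on every completion; a real driver would need something much faster than a linear scan.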

Will

Sunil Kovvuri

Apr 26, 2017, 8:10:06 AM
Well, the driver needs to be written based on how the HW functions, even if
that means making use of an API which hasn't been used by others before.
Even if it's possible to record the info in this driver, the page reference
count still needs to be released to free the page, otherwise the page is gone.

>
> because:
>
> (a) You have the information you need at allocation time, but you've
> failed to record that and are trying to use the IOMMU API to
> reconstruct the CPU virtual address

That's exactly what I explained in the commit message, i.e. why
I cannot record the info at the time of allocation. Also, the HW gives the
address of the buffer (IOVA or physical) where it has DMA'ed the packet, not an
index into the buffer ring. There is one single buffer ring for 8 receive queues,
so there is no way to map the DMA address seen at a receive queue to the
info recorded in the buffer ring.

All you said is possible, and that is exactly what I would have done if the HW
gave me an index into the buffer ring instead of the DMA'ed address, and I wouldn't
have been hit so hard by all the bottlenecks in the ARM IOMMU infrastructure.

Thanks,
Sunil.