NVIDIA FreeBSD kernel feature requests

Christian Zander

Jun 29, 2006, 7:18:04 AM
to freebsd...@freebsd.org
Hi all,

NVIDIA has been looking at ways to improve its graphics driver for the
FreeBSD i386 platform, as well as investigating the possibility of adding
support for the FreeBSD amd64 platform, and identified a number of
obstacles. Some progress has been made to resolve them, and NVIDIA would
like to summarize the current status. We would also like to thank John
Baldwin and Doug Rabson for their valuable help.

This summary makes an attempt to describe the kernel interfaces needed by
the NVIDIA FreeBSD i386 graphics driver to achieve feature parity with
the Linux/Solaris graphics drivers, and/or required to make support for
the FreeBSD amd64 platform feasible. It also describes some of the
technical difficulties encountered by NVIDIA during the FreeBSD i386
graphics driver's development, how these problems have been worked around
and what could be done to solve them better.

While the following is focused on the NVIDIA FreeBSD graphics drivers, we
believe the interfaces discussed below are generally applicable to any
modern high performance graphics driver.

The interfaces in question can be loosely grouped into three classes:
reliability, compatibility and performance.

Reliability:

The NVIDIA graphics driver needs to be able to create uncached kernel
and user mappings of I/O memory, such as NVIDIA GPU registers. The
FreeBSD kernel does not currently provide the interfaces necessary to
specify the memory type when creating such mappings, which makes it
difficult for the NVIDIA graphics driver to guarantee that the correct
memory type is selected.

Kernel mappings of I/O memory can be created with the pmap_mapdev()
interface; user mappings are created with mmap(2). On FreeBSD i386 and
on FreeBSD amd64, the effective memory type of mappings created with
either interface is determined by a given system's MTRR configuration
by default, which will specify the correct UC memory type in most, but
not in all cases.

MTRR configurations with non-UC memory ranges overlapping I/O memory
mapped via pmap_mapdev() or mmap(2) can result in the incorrect memory
type being selected, which can impair reliability.

To reduce the likelihood of problems, the FreeBSD i386 driver updates
the mappings returned by pmap_mapdev() with the PCD/PWT flags to force
use of the UC memory type. On FreeBSD amd64, the presence of a large
static mapping using 2MB pages makes this approach unfeasible.
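
A minimal user-space sketch of the PCD/PWT approach described above. The bit
positions are the architectural x86 page-table entry bits; the helper names
are hypothetical and are not taken from the NVIDIA driver or the FreeBSD
kernel. A real driver would also have to flush the TLB and CPU caches after
changing an entry, which this sketch does not model.

```c
#include <stdint.h>

/* x86 page-table entry cache-control bits (architectural positions). */
#define PTE_PWT (UINT64_C(1) << 3)   /* page-level write-through */
#define PTE_PCD (UINT64_C(1) << 4)   /* page-level cache disable */

/* Hypothetical helper: return the entry with PCD and PWT both set,
 * which selects the UC memory type with the reset-default PAT. */
static inline uint64_t pte_force_uc(uint64_t pte)
{
    return pte | PTE_PCD | PTE_PWT;
}

/* Hypothetical helper: does the entry already request UC via PCD+PWT? */
static inline int pte_is_uc(uint64_t pte)
{
    return (pte & (PTE_PCD | PTE_PWT)) == (PTE_PCD | PTE_PWT);
}
```

With a 2MB mapping there is only one entry covering 512 4KB pages, so the
same bit flip would change the memory type of the entire 2MB range.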

In the case of user mappings, limited control over the memory type can
be exerted with the help of MTRRs, but their lack of flexibility
greatly reduces the feasibility of this approach.

1) The NVIDIA FreeBSD graphics driver is in need of a new interface that
supports the creation of UC kernel mappings on FreeBSD i386 and on
FreeBSD amd64.

John Baldwin is working on a new interface, pmap_mapdev_attr(), which
will allow the NVIDIA graphics driver to create UC kernel mappings
on FreeBSD i386 and on FreeBSD amd64; the implementation on the latter
platform will handle the direct mapping transparently.

2) As described above, user mappings of I/O memory are created via the
mmap(2) interface and the FreeBSD device pager; unfortunately, drivers
do not currently have control over the memory type used.

The NVIDIA FreeBSD graphics driver needs to be able to specify the
memory type used for user mappings created via mmap(2). This interface
is also important for high performance graphics (see 'Performance'
below).

Compatibility:

1) The NVIDIA graphics driver needs to be able to set the memory type of
the kernel mapping of memory allocated with malloc()/contigmalloc()
to UC, which presents essentially the same problems as those outlined
above for I/O memory mappings.

The ability to change the memory type is necessary to avoid aliasing
problems when the memory is mapped into the AGP aperture, which is
accessed via WC user mappings. If the creation of UC/WC user mappings
becomes possible for system memory in the future (see below), the
ability to change the memory type of the associated kernel mappings to
UC will be important for the same reason.

Newer NVIDIA FreeBSD i386 graphics drivers manually update the memory
type of the kernel mappings of malloc() allocated memory using the
approach described for kernel mappings above. This is not feasible on
FreeBSD amd64 due to the static direct mapping (see above).

The NVIDIA FreeBSD graphics driver needs an interface that allows it
to change the memory type of the kernel mapping(s) of system memory
allocated with malloc()/contigmalloc(). The interface should flush CPU
and TLB caches, when necessary.

John Baldwin is working on pmap_change_attr() for FreeBSD i386 and for
FreeBSD amd64, which will allow specifying the desired memory types
for kernel mappings created with e.g. malloc()/contigmalloc().

2) The NVIDIA graphics driver needs to map different types of memory into
the address spaces of user clients, most commonly:

a) NVIDIA graphics device registers
b) NVIDIA graphics device frame buffer memory
c) AGP memory allocations (mapped via the AGP aperture)
d) DMA system memory allocations

This is currently done via mmap(2) and the device pager, i.e. the user
client performs a private ioctl(2) to allocate memory (this step is
specific to the b) - d) memory types), then calls mmap(2) to obtain a
user mapping of the memory. The NVIDIA graphics driver's d_mmap()
callback is invoked first to check the logical mmap(2) offset(s), then
again to return the associated page frame number(s) when the mapping
is accessed for the first time.

The device pager mechanism works well for a) - c), but not for d). The
system memory allocations are frequently very large (several MB) and
need to be allocated physically non-contiguous. This leads to problems
with the d_mmap() interface:

- d_mmap() is called per page with logical offsets computed based on
the mmap(2) base offset provided by the client and the current
page's position within the allocation, but no context information
is provided to d_mmap(). The NVIDIA FreeBSD graphics driver can
look up the associated system memory allocation and determine the
page frame number(s) for a given logical offset only if a linear
address range is associated with each system memory allocation, in
which case the start address can serve as the mmap(2) offset used
by the client and the logical offsets can be compared with each
allocation's linear address range.

Since the memory itself is not physically contiguous, the physical
addresses of pages in the allocation can not be used as mmap(2)
offsets; a different address range needs to be used. The FreeBSD
i386 driver currently allocates its system memory with malloc() and
derives the address range used with mmap(2) from the allocation's
kernel virtual address range.

This allocation of DMA system memory with malloc() is problematic
on FreeBSD i386 PAE and FreeBSD amd64 systems with more than 4GB of
RAM and older NVIDIA GPUs limited to 32-bit DMA, since malloc()
doesn't currently allow drivers to specify allocation constraints,
like contigmalloc() does, i.e. it may allocate physical memory that
can not be addressed by such GPUs.

Further, since the physical addresses of non-contiguous allocations
can not be used as mmap(2) offsets for system memory, but need to
be used for a) - c), the logical and physical addresses used as
mmap(2) offsets can potentially be confused by d_mmap(). The NVIDIA
graphics driver tries to minimize this risk, but can not avoid it
completely without a significant performance penalty.

- The device pager was designed for I/O memory regions and it assumes
that d_mmap() will always return the same page frame number for a
given logical offset. As a result, d_mmap() is invoked exactly once
for any given logical offset by default. In case of system memory
allocations, however, the physical page backing a given offset may
change as the malloc()'d memory is freed/reallocated.

The NVIDIA FreeBSD graphics driver needs to manually invalidate the
translation cache to work around this problem. It does so with the
msync() system call, which was extended for this purpose in FreeBSD
4.7 and again in FreeBSD 4.9 and 5.2.1. This leads to performance
problems on some configurations.
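
The offset lookup described in the first point above can be modeled roughly as
follows. This is a user-space sketch, not driver code: the structure, the
function name and the 4KB page size are assumptions made for illustration.
Each allocation records the linear address range that doubles as its mmap(2)
offset range, plus the page frame number of each (non-contiguous) page.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12   /* assumes 4KB pages */

/* Hypothetical bookkeeping for one system memory allocation. */
struct sysmem_alloc {
    uint64_t base;          /* start of the linear range used as mmap offset */
    size_t npages;          /* number of pages in the allocation */
    const uint64_t *pfns;   /* page frame number backing each page */
};

/* Model of the per-page lookup a d_mmap()-style callback has to perform
 * when it is given only a logical offset and no other context.
 * Returns 0 and stores the page frame number, or -1 if no allocation
 * covers the offset. */
static inline int lookup_pfn(const struct sysmem_alloc *allocs, size_t nallocs,
                             uint64_t offset, uint64_t *pfn)
{
    for (size_t i = 0; i < nallocs; i++) {
        const struct sysmem_alloc *a = &allocs[i];
        uint64_t size = (uint64_t)a->npages << SKETCH_PAGE_SHIFT;

        if (offset >= a->base && offset - a->base < size) {
            *pfn = a->pfns[(offset - a->base) >> SKETCH_PAGE_SHIFT];
            return 0;
        }
    }
    return -1;
}
```

The linear search runs once per faulting page, which is part of why passing
the base offset (or other context) to the callback would simplify matters.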

The NVIDIA FreeBSD graphics driver needs a different interface to make
the mapping of system memory allocations via mmap(2) simpler. If the
d_mmap() callback was extended to be called with the base offset in
addition to the current offset, the first two of the problems detailed
above would no longer be an issue; the NVIDIA graphics driver would
then be able to use physical addresses as mmap(2) offsets for a) - d).

The new interface must not require a FreeBSD-specific ioctl(2), as this
would break compatibility with the NVIDIA Linux OpenGL library used
in the FreeBSD Linux ABI compatibility environment.

3) To be able to support FreeBSD i386 PAE and FreeBSD amd64 systems with
more than 4GB of physical memory and NVIDIA GPUs that are limited to
32-bit DMA, the NVIDIA FreeBSD graphics driver will need to be updated
to allocate memory from within the first 4GB of memory.

Unfortunately, this is not feasible with the current interfaces. The
malloc() interface does not allow the caller to specify allocation
constraints and while contigmalloc() does, its usefulness is currently
limited. This is because DMA memory can't realistically be allocated
contiguously, except if the allocations are very small, and because
a contiguous address range is needed for mmap(2), as described above,
which would need to be maintained separately for contigmalloc() memory
allocations.
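
To make the addressing constraint concrete, here is a small sketch of the
check a driver would want the allocator to enforce for GPUs limited to 32-bit
DMA. The function name is hypothetical; the only fact assumed is that such a
device can reach physical addresses below 4GB.

```c
#include <stdint.h>

/* Return 1 if the physical range [paddr, paddr + len) lies entirely
 * below 4GB and is therefore reachable by a 32-bit DMA engine. */
static inline int dma32_addressable(uint64_t paddr, uint64_t len)
{
    const uint64_t limit = UINT64_C(1) << 32;

    if (len == 0 || len > limit)
        return 0;
    /* Written this way to avoid overflow in paddr + len. */
    return paddr <= limit - len;
}
```

For a non-contiguous allocation, every page would have to pass this check
individually.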

The introduction of an malloc() variant that allows the specification
of allocation constraints would solve the addressing problem, but
due to the problems caused by using logical and physical addresses for
mmap(2), a different solution would be preferred. By making it
possible to use physical addresses exclusively as mmap(2) offsets, as
described above, the NVIDIA FreeBSD graphics driver could use the
contigmalloc() interface to allocate the individual pages in the larger
non-contiguous allocations.

If contigmalloc() were used, the NVIDIA FreeBSD graphics driver would
need to be able to create contiguous virtual mappings spanning more
than one page within larger virtually non-contiguous allocations; this
functionality had best be implemented in the FreeBSD kernel.

The 'vmap()' kernel interface does this on Linux. It takes an array of
pages and maps them into a single contiguous address range.

Performance:

1) For optimal PCI-E performance and improved compatibility with systems
where MTRR memory ranges do not provide sufficient flexibility, the
NVIDIA FreeBSD graphics driver needs to be able to specify the memory
type used for user mappings created with mmap(2).

John Baldwin is working on PAT support for FreeBSD, which will be used
by the pmap_mapdev_attr() and pmap_change_attr() kernel interfaces
referred to above. This support can provide the desired flexibility if
the d_mmap() interface is extended or complemented with a new one,
allowing drivers to take advantage of the PAT support.

In order to provide optimal PCI-E performance, NVIDIA FreeBSD graphics
drivers need to be able to create WC system memory mappings.
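
As a rough illustration of how PAT selects a per-page memory type, the sketch
below models the lookup for a 4KB page-table entry. The bit positions, the
type encodings and the reset-default PAT contents are the documented x86
values; the helper names are hypothetical, and reprogramming a particular slot
to WC is only an example, not a statement about what FreeBSD does or will do.
The result is the PAT type only; the effective type also depends on how the
CPU combines it with the MTRR type.

```c
#include <stdint.h>

/* x86 memory type encodings as used in the PAT MSR. */
enum memtype {
    MT_UC = 0x00, MT_WC = 0x01, MT_WT = 0x04,
    MT_WP = 0x05, MT_WB = 0x06, MT_UC_MINUS = 0x07
};

#define PTE_PWT (UINT64_C(1) << 3)
#define PTE_PCD (UINT64_C(1) << 4)
#define PTE_PAT (UINT64_C(1) << 7)   /* PAT bit position in a 4KB PTE */

/* PAT contents after reset: note that no slot holds WC. */
static const uint8_t pat_reset_default[8] = {
    MT_WB, MT_WT, MT_UC_MINUS, MT_UC,
    MT_WB, MT_WT, MT_UC_MINUS, MT_UC
};

/* The three PTE bits form a 3-bit index into the PAT. */
static inline unsigned pat_index(uint64_t pte)
{
    return ((pte & PTE_PAT) ? 4u : 0u) |
           ((pte & PTE_PCD) ? 2u : 0u) |
           ((pte & PTE_PWT) ? 1u : 0u);
}

/* Hypothetical helper: PAT memory type selected by a PTE. */
static inline enum memtype pte_memtype(const uint8_t pat[8], uint64_t pte)
{
    return (enum memtype)pat[pat_index(pte)];
}
```

Because the reset-default table contains no WC entry, WC mappings via PAT
require the kernel to reprogram at least one slot first.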

2) The device pager mechanism is page fault based, which incurs noticeable
overhead due to the large number of user/kernel context switches.
This can result in significant performance penalties with very large
or numerous kernel mappings. It also currently requires the use of the
msync() workaround (see above), which incurs additional overhead.

Performance with the NVIDIA FreeBSD graphics driver would benefit from
an mmap(2) interface that is independent of the device pager and
allows the mappings' page tables to be prebuilt. The Linux and Solaris
operating systems support such interfaces.

3) On Linux and Solaris, the NVIDIA graphics driver can maintain per open
instance data, i.e. data that is specific to the processes' file
descriptors associated with NVIDIA character special files. This is
useful primarily to achieve optimal results with the driver's internal
notification mechanism, which is used to implement Sync-to-VBlank
functionality, among other things. On these two operating systems, the
NVIDIA graphics driver can selectively wake threads select(2)'ing the
device files (/dev/nvidia0..N).

The NVIDIA FreeBSD graphics driver can only maintain per device state
at the moment. It wakes all processes waiting on /dev/nvidiaX, and
needs to traverse a per device event list for each of these processes
to check whether an event was delivered for each one of them, which
incurs some overhead. The logic also can't currently guarantee correct
delivery of events to different threads in the same process.
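
The difference between the two models can be sketched in user space as
follows. All names are hypothetical and the code only models the bookkeeping,
not select(2) or the actual wakeup path.

```c
#include <stddef.h>

/* Per-device model: one shared event list, each entry tagged with the
 * open instance it is meant for. */
struct dev_event {
    int target_open;   /* hypothetical id of the intended open instance */
};

/* What every woken waiter has to do in the per-device model:
 * scan the whole device list for an event addressed to it. */
static inline int device_list_has_event(const struct dev_event *ev, size_t n,
                                        int open_id)
{
    for (size_t i = 0; i < n; i++)
        if (ev[i].target_open == open_id)
            return 1;
    return 0;
}

/* Per-open model: each open instance carries its own pending counter,
 * so only the intended waiter needs to be woken and nothing is scanned. */
struct open_state {
    unsigned pending;
};

static inline void open_post_event(struct open_state *os)
{
    os->pending++;
}

/* Consume one pending event; returns 1 if there was one, 0 otherwise. */
static inline int open_take_event(struct open_state *os)
{
    if (os->pending == 0)
        return 0;
    os->pending--;
    return 1;
}
```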

Future versions of the NVIDIA FreeBSD graphics driver are likely to
employ the notification mechanism more aggressively, to better support
composited X desktop functionality.


Summary of Tasks:

# Task:       implement pmap_mapdev_attr() on FreeBSD i386 and on
              FreeBSD amd64.
  Motivation: allows reliable creation of kernel mappings of I/O
              memory with specific cache attributes (with per-page
              granularity).
  Priority:   gates FreeBSD amd64 support.
  Status:     is being implemented for i386 and amd64 (work is being
              done to allow easily breaking down 2MB pages).


# Task:       design/implement better mmap(2) mechanism for mapping
              memory to user space (context information, cache
              attributes).
  Motivation: allows reliable creation of user mappings of DMA and
              I/O memory and support for systems with more than
              4GB of RAM.
  Priority:   gates improved FreeBSD i386 support (PCI-E performance,
              SLI support, improved reliability); gates FreeBSD
              amd64 support.
  Status:     has not been started, pending.


# Task:       implement pmap_change_attr() on FreeBSD i386 and on
              FreeBSD amd64.
  Motivation: allows prevention of cache coherency problems.
  Priority:   gates FreeBSD amd64 support.
  Status:     is being implemented for i386 and amd64.


# Task:       implement vmap()-like kernel interface.
  Motivation: allows creation of contiguous kernel mappings of
              parts of or complete non-contiguous DMA/system memory
              allocations.
  Priority:   gates support for systems with more than 4GB of RAM.
  Status:     has not been started.


# Task:       implement mechanism to allow character drivers to
              maintain per-open instance data (e.g. like the Linux
              kernel's 'struct file *').
  Motivation: allows per thread NVIDIA notification delivery; also
              reduces CPU overhead for notification delivery
              from the NVIDIA kernel module to the X driver and to
              OpenGL.
  Priority:   should translate to improved X/OpenGL performance.
  Status:     has not been started.

Thanks,

--
christian zander
ch?zan...@nvidia.com
_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hacke...@freebsd.org"

Oleksandr Tymoshenko

Jun 29, 2006, 12:21:28 PM
to Christian Zander, freebsd...@freebsd.org
Christian Zander wrote:
> Hi all,

> # Task: implement mechanism to allow character drivers to
> maintain per-open instance data (e.g. like the Linux
> kernel's 'struct file *').
> Motivation: allows per thread NVIDIA notification delivery; also
> reduces CPU overhead for notification delivery
> from the NVIDIA kernel module to the X driver and to
> OpenGL.
> Priority: should translate to improved X/OpenGL performance.
> Status: has not been started.
I've stumbled across this issue a while ago. Actually it can
be partially solved using EVENTHANDLER_REGISTER of dev_clone event with
keeping state structure in si_drv1 or si_drv2 fields. I'm not sure it's
the best solution but it works for me though it smells like hack, and
looks like hack :) Anyway, having legitimate per-open instance data
structures of cdevs is a great assistance in porting Linux drivers to
FreeBSD. Just my $0.02.

--
Sincerely,

Oleksandr Tymoshenko
PBXpress Communications, Inc.
http://www.pbxpress.com
Tel./Fax.: +1 866 SIP PBX1 Ext. 656

Kip Macy

Jun 29, 2006, 12:34:58 PM
to Oleksandr Tymoshenko, freebsd...@freebsd.org, Christian Zander
IIRC lack of per instance cdevs also limits FreeBSD to one vmware instance.

-Kip

Alexander Kabaev

Jun 29, 2006, 1:30:45 PM
to Oleksandr Tymoshenko, freebsd...@freebsd.org, Christian Zander
On Thu, Jun 29, 2006 at 09:32:42AM -0700, Kip Macy wrote:
> IIRC lack of per instance cdevs also limits Freebsd to one vmware instance.
>
> -Kip
>
> On 6/29/06, Oleksandr Tymoshenko <go...@pbxpress.com> wrote:
> >Christian Zander wrote:
> >> Hi all,
> >> # Task: implement mechanism to allow character drivers to
> >> maintain per-open instance data (e.g. like the Linux
> >> kernel's 'struct file *').
> >> Motivation: allows per thread NVIDIA notification delivery; also
> >> reduces CPU overhead for notification delivery
> >> from the NVIDIA kernel module to the X driver and to
> >> OpenGL.
> >> Priority: should translate to improved X/OpenGL performance.
> >> Status: has not been started.
> > I've stumbled across this issue a while ago. Actually it can
> >be partially solved using EVENTHANDLER_REGISTER of dev_clone event with
> >keeping state structure in si_drv1 or si_drv2 fields. I'm not sure it's
> >the best solution but it works for me though it smells like hack, and
> >looks like hack :) Anyway, having legitimate per-open instance data
> >structures of cdevs is a great assistance in porting linux drivers to
> >FreeBSD. Just my $0.02.
> >

WHY does it smell like a hack? It was designed precisely to do that. I am
using cloned devices in our product with great success. Every client
opening 'magic' device gets its own exclusive cloned device instance
and everything works like a charm. I am yet to hear any single coherent
description of what Linux's approach has over device cloning in FreeBSD.
I wouldn't mind being educated on this.

--
Alexander Kabaev

Christian Zander

Jun 29, 2006, 1:47:48 PM
to Alexander Kabaev, freebsd...@freebsd.org, Oleksandr Tymoshenko, Christian Zander

Thanks for your feedback, I hadn't been aware of this interface, but
it sounds promising. When was it first introduced? Are there any
known problems with it and certain FreeBSD releases, or is it expected
to work fine in FreeBSD >= 5.3?

Thanks,


> --
> Alexander Kabaev

--
christian zander
ch?zan...@nvidia.com

Oleksandr Tymoshenko

Jun 29, 2006, 2:20:39 PM
to Alexander Kabaev, freebsd...@freebsd.org, Christian Zander
Alexander Kabaev wrote:
> WHY it smells like a hack? It was designed precisely to do that. I am
> using cloned devices in our product with great success. Every client
> opening 'magic' device gets its own exclusive cloned device instance
> and everything works like a charm. I am yet to hear any single coherent
> description of what Linux's approach has over device cloning in FreeBSD.
> I wouldn't mind being educated on this.
OK, it's a lack of knowledge on my part. It seemed a bit unnatural to me
to create device nodes instead of keeping a single pointer, and I decided
it was supposed to do something other than keeping per-open instance data.
It would be great to have this event/mechanism documented, since I only
found it by looking through the source code in /usr/src/sys. Not the worst
place to get information, but man pages are better :)

--
Sincerely,

Oleksandr Tymoshenko
PBXpress Communications, Inc.
http://www.pbxpress.com
Tel./Fax.: +1 866 SIP PBX1 Ext. 656

Sam Leffler

Jun 29, 2006, 4:36:14 PM
to Christian Zander, freebsd...@freebsd.org, Alexander Kabaev, Oleksandr Tymoshenko

It came in with devfs so it should be in all 5.x systems.

Sam

Doug Ambrisko

Jun 30, 2006, 4:42:16 PM
to km...@fsmware.com, freebsd...@freebsd.org, Oleksandr Tymoshenko, Christian Zander
Kip Macy writes:
| IIRC lack of per instance cdevs also limits Freebsd to one vmware instance.

Really? Don't tell my vmware multiple instances! I used to run 10 on
one FreeBSD host.

Doug A.

Kip Macy

Jun 30, 2006, 9:20:13 PM
to Doug Ambrisko, freebsd...@freebsd.org, Oleksandr Tymoshenko, Christian Zander
WOW THATS GREAT DOUG! \0/ - it didn't work for me.
-Kip

Doug Ambrisko

Jun 30, 2006, 10:17:14 PM
to km...@fsmware.com, freebsd...@freebsd.org, Oleksandr Tymoshenko, Christian Zander
Kip Macy writes:
| WOW THATS GREAT DOUG! \0/ - it didn't work for me.

This was with the last patched driver for vmware 2. I'm not sure if
it ever made it into the port.

http://www.mindspring.com/~vsilyaev/vmware/files/changes
28 Jan 01 Version 0.99-1-0.22
Support for multiple vmware sessions
Thnx to Luigi Rizzo
Support for bridged and host-only networking
Some fixes for -STABLE and -CURRENT

Looking at the port I don't see that it's been updated to 99 yet.

Miguel Mendez

Jul 2, 2006, 5:50:58 AM
to Christian Zander, freebsd...@freebsd.org
On Thu, 29 Jun 2006 13:12:31 +0200
Christian Zander <cza...@nvidia.com> wrote:

Hi,

I just saw an article on OSNews about this, seems I missed it.

> NVIDIA has been looking at ways to improve its graphics driver for the
> FreeBSD i386 platform, as well as investigating the possibility of adding
> support for the FreeBSD amd64 platform, and identified a number of
> obstacles. Some progress has been made to resolve them, and NVIDIA would

Yes, I'll tell you what the obstacle is: Lack of documentation. If you
guys released the specs of your hardware this wouldn't be a problem.
Maybe not for the latest GPUs but I'm sure a lot of people would be happy
if they could use not-so-new NVidia hardware on FreeBSD/amd64. I built
an AMD64 box from scratch with the sole purpose of running
FreeBSD/amd64 on it. When it came time to choose the gfx card the choice
was obvious: Ati Radeon 9250.

I know that a lot of FreeBSDers are more than happy to have proprietary
drivers which I personally won't touch with the proverbial 10 foot
pole :)

So please, do tell, is there any _real_ problem with releasing a
register spec doc for last year's hardware so amd64 users can hope to
have more than a framebuffer some day? How about the proprietary
nforce4 chipset?

Cheers.
--
Miguel Mendez <mme...@energyhq.be>
http://www.energyhq.be
PGP Key: 0xDC8514F1

Michal Mertl

Jul 2, 2006, 5:54:21 PM
to Miguel Mendez, freebsd...@freebsd.org, Christian Zander
Miguel Mendez wrote:
> On Thu, 29 Jun 2006 13:12:31 +0200
> Christian Zander <cza...@nvidia.com> wrote:
>
> Hi,
>
> I just saw an article on OSNews about this, seems I missed it.
>
> > NVIDIA has been looking at ways to improve its graphics driver for the
> > FreeBSD i386 platform, as well as investigating the possibility of adding
> > support for the FreeBSD amd64 platform, and identified a number of
> > obstacles. Some progress has been made to resolve them, and NVIDIA would
>
> Yes, I'll tell you what the obstacle is: Lack of documentation. If you
> guys released the specs of your hardware this wouldn't be a problem.

I think that this reaction wasn't called for. Modern GPUs are
extraordinarily complex HW and to write a decent driver will take
appropriate effort. I understand that open source "infected" people
(like me) prefer having the detailed HW documentation but we shouldn't
refuse the vendor's efforts to provide a good driver to us.

I haven't understood much of Mr. Zander's questions but I am pretty sure
some readers did and probably have been talking to him off-list. I also
tend to believe that his requests for features were based on good
understanding of FreeBSD kernel internals (better than mine and probably
also yours) and if we add the features or help him effectively use
what's there everyone will benefit.

> Maybe not for the latest GPUs but I'm sure a lot people would be happy
> if they could use not-so-new NVidia hardware on FreeBSD/amd64. I built
> and AMD64 box from scratch with the sole purpose of running
> FreeBSD/amd64 on it. When it came time to choose the gfx card the choice
> was obvious: Ati Radeon 9250.
>
> I know that a lot of FreeBSDers are more than happy to have proprietary
> drivers which I personlly won't touch with the proverbial 10 foot
> pole :)
>
> So please, do tell, is there any _real_ problem with releasing a
> register spec doc for last year's hardware so amd64 users can hope to
> have more than a framebuffer some day? How about the proprietary
> nforce4 chipset?

They may well have reasons not to disclose everything. There is probably
a lot of their market success hidden in the full specs. I bet that Mr.
Zander can't answer the question (and definitely can't decide to give
the specs) anyway.

And - again - it will probably take a couple of very skilled
programmers' years' time to write a good driver from scratch.


>
> Cheers.

Sam Smith

Jul 2, 2006, 7:11:41 PM
to Michal Mertl, freebsd...@freebsd.org, Christian Zander
On Sun, 2 Jul 2006, Michal Mertl wrote:
> And - again - it will probably take a couple of very skilled
> programmers' years' time to write good driver from scratch.

It took someone far less than that
http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/if_nfe.c
http://www.openbsd.org/cgi-bin/man.cgi?query=nfe&sektion=4

Nvidia don't want to give out docs.

That's their commercial decision - as a result, if you buy
their cards, you may end up with a particularly expensive
paperweight the day they decide you need to buy a new card
for your new version of freebsd which has different
internals; or someone finds bugs in their drivers that
they won't fix. It's not like there aren't plenty of other
vendors who are more willing to help the developers with
documentation in an open manner.


Regards
Sam

--
Procrastination: Hard work often pays off after time.
Laziness pays off now

Joseph Koshy

Jul 2, 2006, 11:38:49 PM
to Sam Smith, freebsd...@freebsd.org, Michal Mertl, Christian Zander
> That's their commercial decision - as a result, if you
> their cards, you may end up with a particularly expensive
> paperweight the day they decide you need to buy a new card
> for your new version of freebsd which has different
> internals; or someone finds bugs in their drivers that
> they wont fix.

This is the relatively benign scenario.

In the less benign one that "convenient" binary driver that
you loaded into the kernel would contain a silent security
vulnerability. Google for "Sony DRM rootkit".

> it's not like there aren't plenty of other
> vendors who are more willing to help the developers with
> documentation in an open manner.

True.

--
FreeBSD Volunteer, http://people.freebsd.org/~jkoshy

M. Warner Losh

Jul 2, 2006, 11:49:29 PM
to km...@fsmware.com, freebsd...@freebsd.org
In message: <b1fa29170606290932m419...@mail.gmail.com>
"Kip Macy" <kip....@gmail.com> writes:
: IIRC lack of per instance cdevs also limits Freebsd to one vmware instance.

Can you describe the proper semantics here? A cdev is a cdev, and
when we do things like dup we just copy the reference to that cdev.
This has also traditionally been resisted on layering violations
grounds (since the data we have doesn't map easily back to the fd at
the time we call the cdev methods).

Warner


M. Warner Losh

Jul 3, 2006, 12:04:27 AM
to cza...@nvidia.com, freebsd...@freebsd.org
In message: <2006062911...@wolf.nvidia.com>
Christian Zander <cza...@nvidia.com> writes:
: This summary makes an attempt to describe the kernel interfaces needed by
: the NVIDIA FreeBSD i386 graphics driver to achieve feature parity with
: the Linux/Solaris graphics drivers, and/or required to make support for
: the FreeBSD amd64 platform feasible. It also describes some of the
: technical difficulties encountered by NVIDIA during the FreeBSD i386
: graphics driver's development, how these problems have been worked around
: and what could be done to solve them better.

Thank you for taking the time to let us know how we might make the
system better.

: The NVIDIA graphics driver needs to be able to create uncached kernel
: and user mappings of I/O memory, such as NVIDIA GPU registers. The
: FreeBSD kernel does not currently provide the interfaces necessary to
: specify the memory type when creating such mappings, which makes it
: difficult for the NVIDIA graphics driver to guarantee that the correct
: memory type is selected.

Is this via the bus_alloc_resource interface? Is uncached kernel
memory different than non-prefetchable memory? If so, please specify
how it is different. If not, then we have an interface that will do
what you want, except it is only implemented for cardbus and would
need to be implemented for PCI-PCI and PCI host bridges. Would having
better functionality here help? I noticed it wasn't on the task list...

Warner

Robert Watson

Jul 3, 2006, 5:44:29 AM
to M. Warner Losh, freebsd...@freebsd.org, km...@fsmware.com

On Sun, 2 Jul 2006, M. Warner Losh wrote:

> In message: <b1fa29170606290932m419...@mail.gmail.com>
> "Kip Macy" <kip....@gmail.com> writes:
> : IIRC lack of per instance cdevs also limits Freebsd to one vmware instance.
>
> Can you describe the proper semantics here? A cdev is a cdev, and when we
> do things like dup we just copy the reference to that cdev. This has also
> traditionally been resisted on layering violations grounds (since the data
> we have doesn't map easily back to the fd at the time we call the cdev
> methods).

In the past, I've done some experimental implementation allowing devfs
providers to provide session cookies for the file descriptor. This is fairly
contrary to our VFS design, and our notions of "open" are a bit hazy -- for
example, there are a number of situations in which I/O occurs on vnodes
without the vnode being open. The devfs cloning model does offer significant
simplifications in many cases, and certainly fits our VFS model a bit more
happily. It also exists today. It may be that our VMware kernel module
doesn't know about it yet, however.

Robert N M Watson
Computer Laboratory
University of Cambridge

Robert Watson

Jul 3, 2006, 5:46:16 AM
to Sam Smith, freebsd...@freebsd.org, Michal Mertl, Christian Zander
On Sun, 2 Jul 2006, Sam Smith wrote:

> On Sun, 2 Jul 2006, Michal Mertl wrote:
>> And - again - it will probably take a couple of very skilled
>> programmers' years' time to write good driver from scratch.
>
> It took someone far less than that
> http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/if_nfe.c
> http://www.openbsd.org/cgi-bin/man.cgi?query=nfe&sektion=4
>
> Nvidia don't want to give out docs.
>
> That's their commercial decision - as a result, if you their cards, you may
> end up with a particularly expensive paperweight the day they decide you
> need to buy a new card for your new version of freebsd which has different
> internals; or someone finds bugs in their drivers that they wont fix. it's
> not like there aren't plenty of other vendors who are more willing to help
> the developers with documentation in an open manner.

As I've also pointed out privately, but figure the list might benefit from --
this is a discussion of NVIDIA's video hardware, not network hardware, and the
differences are significant:

Producing a device driver for a network interface is a pretty casual activity,
since network interfaces are often just glorified hardware fifos, and there is
relatively little that distinguishes most low-end cards on the market.

Producing a driver for a GPU card, especially one that possibly converts from
GL-foo to foo appropriate to program and feed an ASIC on a video card, is
quite a different matter entirely.

I'm all for open source drivers, and would also encourage NVIDIA to continue
to reconsider their closed source driver approach where it makes sense
(especially for the network interfaces). However, I think that we shouldn't
conflate these two cases rhetorically, as there are orders of magnitude
complexity (and intellectual property) differences.

Robert N M Watson
Computer Laboratory
University of Cambridge

Christian Zander

Jul 3, 2006, 8:07:36 AM
to M. Warner Losh, freebsd...@freebsd.org, cza...@nvidia.com
On Sun, Jul 02, 2006 at 10:02:17PM -0600, M. Warner Losh wrote:
> In message: <2006062911...@wolf.nvidia.com>
> Christian Zander <cza...@nvidia.com> writes:
> : This summary makes an attempt to describe the kernel interfaces needed by
> : the NVIDIA FreeBSD i386 graphics driver to achieve feature parity with
> : the Linux/Solaris graphics drivers, and/or required to make support for
> : the FreeBSD amd64 platform feasible. It also describes some of the
> : technical difficulties encountered by NVIDIA during the FreeBSD i386
> : graphics driver's development, how these problems have been worked around
> : and what could be done to solve them better.
>
> Thank you for taking the time to let us know how we might make the
> system better.
>
> : The NVIDIA graphics driver needs to be able to create uncached kernel
> : and user mappings of I/O memory, such as NVIDIA GPU registers. The
> : FreeBSD kernel does not currently provide the interfaces necessary to
> : specify the memory type when creating such mappings, which makes it
> : difficult for the NVIDIA graphics driver to guarantee that the correct
> : memory type is selected.
>
> Is this via the bus_alloc_resource interface? Is uncached kernel
> memory different than non-prefetchable memory? If so, please specify
> how it is different. If not, then we have an interface that will do
> what you want, except it is only implemented for cardbus and would
> need to be implemented for PCI-PCI and PCI host bridges. Would having
> better functionality here help? I noticed it wasn't on the task list...
>

The I/O memory in question is non-prefetchable. The NVIDIA FreeBSD
graphics driver currently uses the bus_alloc_resource() interface
without the RF_ACTIVE flag and then uses pmap_mapdev() to obtain
kernel mappings of the I/O memory, which it then updates with the
PCD/PWT flags to force them to be uncached. User mappings are created
via mmap(); they use the effective memory type derived from the MTRR
configuration. If you're interested in taking a look, the FreeBSD
kernel specific interface code is included with the NVIDIA graphics
driver.
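
For readers following along, the PCD/PWT update mentioned above can be pictured with a minimal user-space sketch. This is not the code shipped with the NVIDIA driver; the helper names pte_make_uncached() and pte_is_uncached() are hypothetical, and the only thing assumed is the architectural x86 page-table entry layout, where PWT is bit 3 and PCD is bit 4.

```c
#include <stdint.h>

/* x86 page-table entry cache-control bits (architectural bit positions). */
#define PTE_PWT (UINT64_C(1) << 3) /* page-level write-through */
#define PTE_PCD (UINT64_C(1) << 4) /* page-level cache disable */

/*
 * Hypothetical helper: return a copy of `pte` with both PCD and PWT set.
 * With the PAT bit clear and the power-on PAT configuration, this
 * combination selects the strong uncacheable (UC) memory type.
 */
static uint64_t pte_make_uncached(uint64_t pte)
{
    return pte | PTE_PCD | PTE_PWT;
}

/* Hypothetical helper: nonzero if both PCD and PWT are set in `pte`. */
static int pte_is_uncached(uint64_t pte)
{
    return (pte & (PTE_PCD | PTE_PWT)) == (PTE_PCD | PTE_PWT);
}
```

The sketch only manipulates a 64-bit value. Kernel code that changes a live mapping this way would also have to deal with stale TLB entries and cached lines, which is part of why an interface that handles the details is preferable.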

John is working on pmap_mapdev_attr(), which is built on top of the
PAT support he is adding, and this interface will allow the caller to
request a specific memory type to use for the mapping, handling the
details transparently (e.g. the direct mapping on FreeBSD amd64). It
would probably be useful if the bus_alloc_resource() interface
supported this functionality, but the NVIDIA graphics driver would
still need to use the pmap_mapdev_attr() interface, e.g. for its AGP
GART driver.

The current plan is to replace pmap_mapdev() with pmap_mapdev_attr()
in the driver when the latter interface becomes available.

Thanks,


> Warner

--
christian zander
ch?zan...@nvidia.com

Christian Zander

Jul 3, 2006, 8:09:28 AM
to Robert Watson, freebsd...@freebsd.org, km...@fsmware.com
On Mon, Jul 03, 2006 at 10:33:29AM +0100, Robert Watson wrote:
>
> On Sun, 2 Jul 2006, M. Warner Losh wrote:
>
> >In message: <b1fa29170606290932m419...@mail.gmail.com>
> > "Kip Macy" <kip....@gmail.com> writes:
> >: IIRC lack of per instance cdevs also limits FreeBSD to one VMware
> >: instance.
> >
> >Can you describe the proper semantics here? A cdev is a cdev, and when we
> >do things like dup we just copy the reference to that cdev. This has also
> >traditionally been resisted on layering violations grounds (since the data
> >we have doesn't map easily back to the fd at the time we call the cdev
> >methods).
>
> In the past, I've done some experimental implementation allowing devfs
> providers to provide session cookies for the file descriptor. This is
> fairly contrary to our VFS design, and our notions of "open" are a bit hazy
> -- for example, there are a number of situations in which I/O occurs on
> vnodes without the vnode being open. The devfs cloning model does offer
> significant simplifications in many cases, and certainly fits our VFS model a
> bit more happily. It also exists today. It may be that our VMware kernel
> module doesn't know about it yet, however.
>

I've made a first pass at implementing support for the device cloning
mechanism in the NVIDIA FreeBSD graphics driver and it seems to work
well with the driver's notification mechanism on FreeBSD 5.3. I'll
need to do more testing and check what implications the mechanism has
(locking, etc.), but it looks like it's a good match.

Thanks,

--
christian zander
ch?zan...@nvidia.com

Avleen Vig

Jul 4, 2006, 2:10:41 AM
to freebsd...@freebsd.org
On Mon, Jul 03, 2006 at 08:42:42AM +0530, Joseph Koshy wrote:
> In the less benign one that "convenient" binary driver that
> you loaded into the kernel would contain a silent security
> vulnerability. Google for "Sony DRM rootkit".

I think the difference here is that NVIDIA are a little more trusted
than Sony ever was :-)

Kip Macy

Jul 4, 2006, 2:29:06 AM
to Robert Watson, Sam Smith, freebsd...@freebsd.org, Michal Mertl, Christian Zander
> Producing a driver for a GPU card, especially one that possibly converts from
> GL-foo to foo appropriate to program and feed an ASIC on a video card, is
> quite a different matter entirely.
>
> I'm all for open source drivers, and would also encourage NVIDIA to continue
> to reconsider their closed source driver approach where it makes sense
> (especially for the network interfaces). However, I think that we shouldn't
> conflate these two cases rhetorically, as there are orders of magnitude
> complexity (and intellectual property) differences.

Furthermore, requesting needed changes to the kernel interfaces is
completely orthogonal to their documentation policies.

-Kip

M. Warner Losh

Jul 5, 2006, 2:10:04 AM
to km...@fsmware.com, kip....@gmail.com, S...@msmith.net, freebsd...@freebsd.org, mi...@traveller.cz, rwa...@freebsd.org, cza...@nvidia.com
In message: <b1fa29170607032327i1e8...@mail.gmail.com>
"Kip Macy" <kip....@gmail.com> writes:
: > Producing a driver for a GPU card, especially one that possibly converts from
: > GL-foo to foo appropriate to program and feed an ASIC on a video card, is
: > quite a different matter entirely.
: >
: > I'm all for open source drivers, and would also encourage NVIDIA to continue
: > to reconsider their closed source driver approach where it makes sense
: > (especially for the network interfaces). However, I think that we shouldn't
: > conflate these two cases rhetorically, as there are orders of magnitude
: > complexity (and intellectual property) differences.
:
: Furthermore, requesting needed changes to the kernel interfaces is
: completely orthogonal to their documentation policies.

It is well documented that NVIDIA gives you binary drivers. Other
vendors give you source or binary as they see fit. When you have a
choice, the type of driver may factor into what you buy. When you
don't have a choice (because, say, it is a built-in chip), the
availability of a binary-only driver vs no driver at all may save your
laptop or computer from being a paperweight.

As interesting as such choices are, they, as Kip points out, are
orthogonal to suggestions on how FreeBSD could be better. Even if
NVIDIA had released open source drivers for their current binary
drivers, the issues they took the trouble to document and write up in a
clear, coherent fashion here would still be present. It is in the
best interest of the FreeBSD project to accept this valuable input and
evaluate it on its merits, rather than on our political judgment of
the messenger.

Warner

John Baldwin

Jul 5, 2006, 3:36:22 PM
to freebsd...@freebsd.org, cza...@nvidia.com
On Monday 03 July 2006 00:02, M. Warner Losh wrote:
> In message: <2006062911...@wolf.nvidia.com>
> Christian Zander <cza...@nvidia.com> writes:
> : This summary makes an attempt to describe the kernel interfaces needed by
> : the NVIDIA FreeBSD i386 graphics driver to achieve feature parity with
> : the Linux/Solaris graphics drivers, and/or required to make support for
> : the FreeBSD amd64 platform feasible. It also describes some of the
> : technical difficulties encountered by NVIDIA during the FreeBSD i386
> : graphics driver's development, how these problems have been worked around
> : and what could be done to solve them better.
>
> Thank you for taking the time to let us know how we might make the
> system better.
>
> : The NVIDIA graphics driver needs to be able to create uncached kernel
> : and user mappings of I/O memory, such as NVIDIA GPU registers. The
> : FreeBSD kernel does not currently provide the interfaces necessary to
> : specify the memory type when creating such mappings, which makes it
> : difficult for the NVIDIA graphics driver to guarantee that the correct
> : memory type is selected.
>
> Is this via the bus_alloc_resource interface? Is uncached kernel
> memory different than non-prefetchable memory? If so, please specify
> how it is different. If not, then we have an interface that will do
> what you want, except it is only implemented for cardbus and would
> need to be implemented for PCI-PCI and PCI host bridges. Would having
> better functionality here help? I noticed it wasn't on the task list...

This isn't an issue of how the memory is mapped in the PCI-PCI bridge where
non-prefetchable is used to keep the bridge from prefetching things, but as
to how the memory is mapped in the CPU itself. Also, I've seen mention of
using bus_dma, etc. One of the problems is our current bus APIs have a very
limited view of caching "modes". E.g. here you mention overloading
non-prefetchable to get a UC mapping. In bus_dma(9) we have the COHERENT
flag to get a UC rather than a WB mapping. Neither of these APIs allows for, say,
WC (Write-Combining) mappings. :) Other OSes such as Windows and OS X allow
you to explicitly specify what type of cache "mode" you want for a mapping.
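
To make the cache "modes" above concrete, here is a minimal user-space sketch of how an x86 page-table entry selects a memory type through the PAT. It is an illustration, not FreeBSD kernel code: the helper names are hypothetical, and it assumes the architectural PAT memory-type encodings and the power-on default value of the IA32_PAT MSR as documented by Intel.

```c
#include <stdint.h>

/* Power-on default of the IA32_PAT MSR: WB, WT, UC-, UC, repeated twice. */
#define PAT_DEFAULT UINT64_C(0x0007040600070406)

/* Hypothetical helper: name of an x86 memory-type encoding used in the PAT. */
static const char *memtype_name(unsigned type)
{
    switch (type) {
    case 0x00: return "UC";   /* strong uncacheable */
    case 0x01: return "WC";   /* write-combining */
    case 0x04: return "WT";   /* write-through */
    case 0x05: return "WP";   /* write-protected */
    case 0x06: return "WB";   /* write-back */
    case 0x07: return "UC-";  /* uncacheable, overridable by MTRRs */
    default:   return "reserved";
    }
}

/* PAT index chosen by a 4 KB PTE: PAT (bit 7), PCD (bit 4), PWT (bit 3). */
static unsigned pte_pat_index(uint64_t pte)
{
    unsigned pat = (unsigned)((pte >> 7) & 1);
    unsigned pcd = (unsigned)((pte >> 4) & 1);
    unsigned pwt = (unsigned)((pte >> 3) & 1);

    return (pat << 2) | (pcd << 1) | pwt;
}

/* Memory type stored in entry `index` (0..7) of a PAT MSR value. */
static unsigned pat_entry(uint64_t pat, unsigned index)
{
    return (unsigned)((pat >> (index * 8)) & 0x7);
}
```

Calling memtype_name(pat_entry(PAT_DEFAULT, pte_pat_index(pte))) gives the type a PTE would select before MTRRs are taken into account. None of the eight default entries is WC, so a WC mapping needs the PAT to be reprogrammed, which is one reason an explicit memory-type argument in the mapping interface is useful.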

--
John Baldwin

Sam Leffler

Jul 5, 2006, 7:31:37 PM
to John Baldwin, freebsd...@freebsd.org, cza...@nvidia.com

As we've discussed privately, bus_dma is the right api for drivers to
use. If it doesn't do what he needs then we need to extend it. Drivers
should not be groveling around inside the vm system.

Sam

John Baldwin

Jul 5, 2006, 11:47:23 PM
to Sam Leffler, freebsd...@freebsd.org, cza...@nvidia.com

I don't disagree, but my point is that the APIs do need extending. Look
at it this way. The current changes are to provide a way so nvidia can
call pmap_foo() functions rather than modifying PTEs and the PAT MSRs
directly. This is progress. :)

--
John Baldwin

Miguel Mendez

Jul 6, 2006, 3:56:35 PM
to Michal Mertl, freebsd...@freebsd.org, cza...@nvidia.com
On Sun, 02 Jul 2006 23:51:35 +0200
Michal Mertl <mi...@traveller.cz> wrote:

Hi,

I'm going to reply to this, once, just to make my argument clear.

> I think that this reaction wasn't called for. Modern GPUs are
> extraordinarily complex HW and to write a decent driver will take
> appropriate effort. I understand that open source "infected" people
> (like me) prefer having the detailed HW documentation but we shouldn't
> refuse the vendor's efforts to provide a good driver to us.

I agree modern GPUs are way more complex than the C64 VIC. However, I
never asked for the source code of the driver. I ran one earlier
version through Siul+hacky's dasm and, believe me, you don't want that
code. NVidia's reason for not releasing the source code of their
proprietary OpenGL driver is that it contains licensed 3rd party code.
Let's buy that for a second. That still doesn't explain why the nforce
chipsets are undocumented. And that doesn't explain why there isn't
any register documentation for earlier GPUs. I mean, for Christ's sake,
if you're on a 6 month release cycle, who honestly cares about a card
that was released 2 years ago? But still no documentation.

You might be familiar with this quote: "Those who would give up
Essential Liberty to purchase a little Temporary Safety, deserve
neither Liberty nor Safety".

These days many freenix users happily give up their freedom in order
to gain functionality.

It's funny when you think about it, NVidia started when some engineers
left SGI. SGI has contributed a lot to the free software community,
e.g. XFS, while NVidia doesn't seem to care at all. Considering that
they're a _hardware_ company I find it amusing that they refuse to help
people do work for them for free.

> I haven't understood much of Mr. Zander's questions but I am pretty sure
> some readers did and probably have been talking to him off-list. I also
> tend to believe that his requests for features were based on good
> understanding of FreeBSD kernel internals (better than mine and probably
> also yours) and if we add the features or help him effectively use
> what's there everyone will benefit.

And as I see it, NVidia is asking FreeBSD developers to invest man
hours, for free, so they can release a _proprietary_ driver for FreeBSD
which you can neither use on !x86/amd64 nor study. Sounds like a very
nice deal.

I know many among the BSD camp consider people like Richard Stallman
and Theo to be extremists when it comes to software freedom. Maybe
you'll change your mind next time you're trying to use NVidia hardware
on a macppc box.

Doug Barton

Jul 7, 2006, 1:06:20 AM
to Miguel Mendez, freebsd...@freebsd.org, cza...@nvidia.com
I found your manifesto for free software interesting, although I didn't
really see anything new there so I snipped it. In my opinion, you can always
vote with your wallet, and if you feel that strongly about needing all the
parts of your system to be "free," well, knock yourself out.

On the other hand, I like nvidia graphics cards, they meet my needs, and I
appreciate their support of free software, particularly the brand of OS that
I choose to use. It is of no consequence to me that their idea of "free"
does not match up with what others believe to be "truly free."

However, what I did want to react to was this bit below ...

Miguel Mendez wrote:

> And as I see it, NVidia is asking FreeBSD developers to invest man
> hours, for free, so they can release a _proprietary_ driver for FreeBSD
> which you can neither use on !x86/amd64 nor study. Sounds like a very
> nice deal.

You may choose to see it how you wish, and you may also choose to try and
persuade others to see it your way. However, several developers (who are a
lot smarter than I when it comes to kernel stuff) have already spoken up to
say that at least some of the ideas presented would have value in other
areas if they were implemented, so at best I would deem your
characterization to be slanted. You may wish to consider if perhaps it is
not also inaccurate.

Doug

--

This .signature sanitized for your protection
