Alternatively, I could create a render target surface which can't be
used as a texture, but could be mapped for reading by the CPU.
The way I read the DirectX 11 documentation, the latter option is no
longer available. I know I can create a separate texture with
D3D11_USAGE_STAGING and copy a render target texture into it, then map
it for reading by the CPU, but I find the extra copy operation
annoying -- even though I don't have empirical evidence that it causes
performance problems for my use case.
Or am I missing something?
Thanks,
Emil Dotchevski
Reverge Studios, Inc.
http://www.revergestudios.com/reblog/index.php?n=ReCode
On most hardware, you can't efficiently get CPU access to that data on the GPU.
In Direct3D 10.x and 11, you explicitly do this yourself by copying it to a
staging resource. Note that readback from the GPU is not buffered, so you
should do your own double-buffering to avoid stalls.
In Direct3D 9 this was hidden by the API doing the copies for you, which had
unpredictable performance impacts.
--
-Chuck Walbourn
Senior SDE, Windows Gaming Experience
http://blogs.msdn.com/chuckw/
This posting is provided "AS IS" with no warranties, and confers no rights.
Chuck, thanks for your reply.
I understand that on most hardware that is the case for _textures_. In
D3D 9 this was the case also: if you create a texture with the
D3DUSAGE_RENDERTARGET flag, you could bind it as a texture or as a
render target but you couldn't map it for read access by the CPU; you
had to copy it to another surface. In terms of functionality, D3D 9
and 11 are the same in this case.
However, in D3D 9 you could call CreateRenderTarget, which returns an
IDirect3DSurface9 object which is _not_ a texture (in D3D 11 terms,
you can't make a ShaderResourceView from it) but can be used as a
render target and can be locked for read access by the CPU (see
http://msdn.microsoft.com/en-us/library/bb174361%28VS.85%29.aspx).
Are you sure that this results in implicit copying on most hardware?
Is there no support for something similar in D3D 11?
In Direct3D 10.x and 11, copying to a staging resource is the only way to do
GPU readback, and you should double buffer the staging resources when doing
readback to avoid stalling the pipeline. This is how it is actually
implemented in Direct3D 9 'under the covers' in any case.
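To make the double-buffering advice concrete, here is a minimal sketch of just the slot rotation, with no D3D calls in it. The names StagingRing, writeSlot, readSlot, primed and endFrame are made up for illustration and are not part of any Direct3D API. It assumes one readback per frame and N staging resources created up front.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a Direct3D type): decides which of N staging
// resources receives this frame's copy and which one is old enough to map.
template <std::size_t N>
class StagingRing {
    static_assert(N >= 2, "need at least two staging resources");

public:
    // Index of the staging resource to copy the render target into this frame.
    std::size_t writeSlot() const { return frame_ % N; }

    // True once a copy was issued N-1 frames ago, i.e. there is something
    // old enough to read back.
    bool primed() const { return frame_ >= N - 1; }

    // Index of the staging resource whose copy was issued N-1 frames ago.
    // Only valid when primed() is true.
    std::size_t readSlot() const {
        assert(primed());
        return (frame_ + 1) % N;
    }

    // Call once per frame after issuing the copy (and the optional read).
    void endFrame() { ++frame_; }

private:
    std::size_t frame_ = 0;
};
```

In D3D 11 terms, writeSlot() would select the destination texture for ID3D11DeviceContext::CopyResource, and readSlot() the texture passed to ID3D11DeviceContext::Map with D3D11_MAP_READ. With N = 2 the CPU always reads the previous frame's copy, so the Map is less likely to wait on the GPU; if it still might, passing D3D11_MAP_FLAG_DO_NOT_WAIT makes Map return DXGI_ERROR_WAS_STILL_DRAWING instead of blocking.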
You should check out the Gamefest 2007 talk "Windows to Reality: Getting the
Most out of Direct3D 10 Graphics in Your Games". It is still the best single
place to get performance advice for DirectX 10 and it all applies to DirectX
11.
Banning applications from being able to map a render target for read
access by the CPU can be costly. The fact that CUDA does allow this
usage pattern indicates that this D3D10/11 limitation is not based on
either driver API limitations or hardware limitations, at least on
NVIDIA hardware. No API magic is needed.
The RenderTargetView and ShaderResourceView framework provides
efficient support for use cases where the output from one GPU pass is
then bound as input for another GPU pass.
D3D11_USAGE_DYNAMIC takes care of the use case where the CPU writes
data that is then bound as input for a GPU pass.
The last use case, giving the CPU read-only access to data written
by the GPU, appears to be inefficiently supported by DX10/11 while it
was efficiently supported (at least at API level) by D3D9. IMO this is
an error in the DX10/11 design that should be fixed. The more flexible
D3D 10/11 design makes this functionality even more necessary because
now the GPU can more easily be used to do more work than just
rendering.