Shader Model 3.0 Download Windows 7 32-bit


Hanne Rylaarsdam

May 27, 2024, 12:39:14 AM
to wardtefarab

These intrinsics are a required, supported feature of Shader Model 6.4. Consequently, no separate capability bit check is required beyond ensuring that Shader Model 6.4 is in use. The minimum supported client for these routines is Windows 10, version 1903.



Download Zip ☆☆☆☆☆ https://t.co/uHnCUE2Pce



A 4-dimensional unsigned integer dot-product with add. Multiplies together each corresponding pair of unsigned 8-bit int bytes in the two input DWORDs, and sums the results into the 32-bit unsigned integer accumulator. This instruction operates within a single 32-bit wide SIMD lane. The inputs are also assumed to be 32-bit quantities.

A 4-dimensional signed integer dot-product with add. Multiplies together each corresponding pair of signed 8-bit int bytes in the two input DWORDs, and sums the results into the 32-bit signed integer accumulator. This instruction operates within a single 32-bit wide SIMD lane. The inputs are also assumed to be 32-bit quantities.
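As a rough illustration of the two integer operations described above (exposed in HLSL as dot4add_u8packed and dot4add_i8packed), here is a CPU-side Python sketch of the arithmetic. The Python function names are made up for this example, and the wrap-around of the accumulator on overflow is an assumption, so treat it as an emulation of the idea rather than shader code.

```python
def _lanes(dword, signed):
    """Split a 32-bit value into four 8-bit lanes, lowest byte first."""
    lanes = [(dword >> (8 * i)) & 0xFF for i in range(4)]
    if signed:
        # Reinterpret each byte as a two's-complement signed value.
        lanes = [b - 256 if b >= 128 else b for b in lanes]
    return lanes


def dot4add_u8(a, b, acc):
    """Unsigned 4-way dot product of packed bytes, added to acc (assumed to wrap at 32 bits)."""
    total = acc + sum(x * y for x, y in zip(_lanes(a, False), _lanes(b, False)))
    return total & 0xFFFFFFFF


def dot4add_i8(a, b, acc):
    """Signed 4-way dot product of packed bytes, added to acc (assumed to wrap at 32 bits)."""
    total = acc + sum(x * y for x, y in zip(_lanes(a, True), _lanes(b, True)))
    total &= 0xFFFFFFFF
    # Map the wrapped value back into the signed 32-bit range.
    return total - (1 << 32) if total >= (1 << 31) else total
```

For instance, dot4add_u8(0x01020304, 0x01010101, 5) multiplies the byte pairs (4, 3, 2, 1) by 1 each and adds 5, while the signed variant treats a byte of 0xFF as -1.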

A 2-dimensional floating point dot-product of half2 vectors with add. Multiplies the elements of the two half-precision float input vectors together and sums the results into the 32-bit float accumulator. This instruction operates within a single 32-bit wide SIMD lane. The inputs are 16-bit quantities packed into the same lane.
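The half-precision variant (dot2add in HLSL) can be emulated on the CPU in a similar way. The sketch below rounds the inputs to 16-bit floats and the result to a 32-bit float using Python's struct module. The helper names are hypothetical, and the precision of the intermediate products on real hardware is not specified here, so the rounding steps are an assumption.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot2add(a, b, acc):
    """Dot product of two half2 vectors (2-tuples) added to a 32-bit float accumulator."""
    ax, ay = (to_half(v) for v in a)
    bx, by = (to_half(v) for v in b)
    # Assumption: products are accumulated at float32 precision or better.
    return to_float32(ax * bx + ay * by + to_float32(acc))
```

Values that a 16-bit float represents exactly, such as 1.0 or 0.5, pass through unchanged, whereas something like 0.1 is rounded before the multiply.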

This system value is available on platforms that support D3D12_VARIABLE_SHADING_RATE_TIER_2 or higher. It can be written from at most one of the vertex or geometry shader stages. It can be read from the pixel shader stage. For more information, see Variable-rate Shading.

Vertex shaders and pixel shaders are simplified considerably from earlier shader versions. If you are implementing shaders in hardware, you may not use vs_3_0 or ps_3_0 with any other shader versions, and you may not use either shader type with the fixed function pipeline. These changes make it possible to simplify drivers and the runtime. The only exception is that software-only vs_3_0 shaders may be used with any pixel shader version. In addition, if you are using a software-only vs_3_0 shader with a previous pixel shader version, the vertex shader can only use output semantics that are compatible with flexible vertex format (FVF) codes.

The semantics used on vertex shader outputs must be used on pixel shader inputs. The semantics are used to map the vertex shader outputs to the pixel shader inputs, similar to the way the vertex declaration is mapped to the vertex shader input registers in previous shader models. See Match Semantics on vs 3.0 and ps 3.0 Shaders.

Additional wrap mode render states have been added to cover the possibility of additional texture coordinates in this new scheme. Attributes with D3DDECLUSAGE_TEXCOORD and usage index from 0 to 15 are interpolated in wrap mode when the corresponding D3DRS_WRAP* is set.

The vertex shader output register types have been collapsed into twelve registers (see Output Registers). Each register that is used needs to be declared using the dcl instruction and a semantic (for example, dcl_color0 o0.xyzw).

The 3_0 vertex shader model (vs_3_0) expands on the features of vs_2_0 with more powerful register indexing, a set of simplified output registers, the ability to sample a texture in a vertex shader, and the ability to control the rate at which shader inputs are initialized.

You must declare input and output registers before indexing them. However, you may not index any output register that has been declared with a position or point size semantic. In fact, if indexing is used, the position and psize semantics have to be declared in the o0 and o1 registers, respectively.

You are only allowed to index a continuous range of registers; that is, you cannot index across registers that have not been declared. While this restriction may be inconvenient, it permits hardware optimization to take place. Attempting to index across non-contiguous registers will produce undefined results. Shader validation does not enforce this restriction.

All the various types of output registers have been collapsed into twelve output registers: 1 for position, 2 for color, 8 for texture, and 1 for fog or point size. These registers will interpolate any data they contain for the pixel shader. Output register declarations are required and semantics are assigned to each register.

The pixel shader color and texture registers have been collapsed into ten input registers (see Input Register Types). The Face Register is a floating point scalar register. Only the sign of this register is valid. If the sign is negative, the primitive is a back face. This can be used inside a pixel shader to achieve two-sided lighting, for instance. The Position Register contains the (x,y) coordinates of the current pixel.
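To make the two-sided lighting idea concrete, here is a small Python sketch of the decision a pixel shader could make from the sign of the face value. The function name and the choice to flip the normal for back faces are illustrative assumptions, not something prescribed by the text above.

```python
def lighting_normal(normal, face):
    """Return the normal to light with, flipping it when the primitive is a back face.

    Only the sign of `face` is meaningful: negative means back face.
    """
    if face < 0:
        return tuple(-c for c in normal)
    return tuple(normal)
```

A front-facing fragment keeps its interpolated normal, while a back-facing one is lit as if the surface pointed the other way.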

Similarly, a semantic name declared on different input registers in the pixel shader (v0 and v1 in the pixel shader) cannot be used in a single output register in this vertex shader. For instance, this vertex shader cannot be paired with the pixel shader because D3DDECLUSAGE_TEXCOORD1 is used for both pixel shader input registers (v0, v1) and the vertex shader output register o3.

On the other hand, this vertex shader cannot be paired with the pixel shader because the output mask for a parameter with a given semantic does not provide the data that is requested by the pixel shader.
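The mask rule can be pictured as a simple check: every component a pixel shader input reads for a semantic must be written by the vertex shader output carrying that semantic. The Python sketch below is a hypothetical validator written for illustration. The dictionary layout, the semantic names, and the function name are all assumptions, and it does not model the actual runtime validation.

```python
def missing_components(vs_outputs, ps_inputs):
    """Report components requested by the pixel shader but not written by the vertex shader.

    Both arguments map a semantic name (e.g. "texcoord0") to a component
    mask string such as "xy" or "xyzw".
    """
    problems = []
    for semantic, needed in ps_inputs.items():
        provided = vs_outputs.get(semantic, "")
        missing = "".join(c for c in needed if c not in provided)
        if missing:
            problems.append((semantic, missing))
    return problems
```

With a vertex shader that writes only texcoord0.xy and a pixel shader that reads texcoord0.xyz, the check reports that z is not provided, which is the kind of mismatch described above.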

When D3DRS_SHADEMODE is set for flat shading during clipping and triangle rasterization, attributes with D3DDECLUSAGE_COLOR are interpolated as flat shaded. If any components of a register are declared with a color semantic but other components of the same register are given different semantics, the interpolation mode (linear vs. flat) will be undefined on the components in that register without a color semantic.

If fog rendering is desired, vs_3_0 and ps_3_0 shaders must implement fog. No fog calculations are done outside of the shaders. There is no fog register in vs_3_0, and additional semantics D3DDECLUSAGE_FOG (for fog blend factor computed per vertex) and D3DDECLUSAGE_DEPTH (for passing in a depth value to the pixel shader to compute the fog blend factor) have been added.

Floating point math happens at different precision and ranges (16-bit, 24-bit, and 32-bit) in different parts of the pipeline. A value greater than the dynamic range of the pipeline that enters that pipeline (for example, a 32-bit float texture map is sampled into a 24-bit float pipeline in ps_2_0) creates an undefined result. For predictable behavior, you should clamp such a value to the dynamic range maximum.
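A minimal Python sketch of the clamping advice follows. It uses the largest finite IEEE 754 half-precision value (65504) as the example limit. The function name is hypothetical, and the correct limit for a given pipeline, such as a 24-bit float path, depends on the hardware format, so the default here is only an assumption.

```python
HALF_MAX = 65504.0  # largest finite IEEE 754 half-precision (16-bit) value


def clamp_to_range(value, limit=HALF_MAX):
    """Clamp value into [-limit, limit] before it enters a lower-precision stage."""
    return max(-limit, min(limit, value))
```

Applying the clamp before writing a texture or passing a value on keeps the result predictable instead of leaving it to undefined behavior.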

Partial precision (see Pixel Shader Register Modifiers) is requested by adding the _pp modifier to shader code (provided that the underlying implementation supports it). Implementations are always free to ignore the modifier and perform the affected operations in full precision.

In all these cases the developer may choose to specify partial precision to process the data, knowing that no input data precision is lost. In some cases, a shader may require that the internal steps of a calculation be performed at full precision even when input and final output values do not have more than partial precision.

Software implementations (run-time and reference for vertex shaders and reference for pixel shaders) of version 2_0 shaders and above have some validation relaxed. This is useful for debugging and prototyping purposes. The application indicates to the runtime/assembler that it needs some of the validation relaxed using the _sw flag in the assembler (for example, vs_2_sw). A software shader will not work with hardware.

Previous gather operations could retrieve only a single channel of the sampled elements. Because the operations were limited to a single channel, retrieving all the channels of an element required multiple gathers. Additionally, implicit conversions and other processing are applied to these elements, and the programmer has no control over them.

Raw gathers give control to the author by retrieving the raw element data, including all channels, without any conversion. Like other gathers, they retrieve the four elements that would be used for bilinear filtering by Sample. The elements are retrieved as unsigned integers whose size matches the size of the full elements, with channels packed in as specified for the underlying format.

For example, a R32_UINT format resource view could be created for a R8G8B8A8 texture and then within the shader the R32_UINT resource view could then be raw gathered into four 32-bit unsigned integers that represent the raw representation of the R8G8B8A8 data. The author is then able to use that data however they wish.

The elements variable will then contain the four elements sampled, as determined by the sampler state and uv location, packed into 32-bit integers as four 8-bit values representing the RGBA channels. 2D texture arrays can also be used. Where 16-bit and 64-bit integers are supported, formats of those sizes can be cast to the corresponding integer views and raw gathered as well using similar shader code. The number of channels and their layout depend on the underlying resource format.
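Once the four packed values are in hand, pulling the channels back out is plain bit manipulation. The Python sketch below shows one way to unpack them. It assumes the usual little-endian layout for R8G8B8A8, with R in the lowest 8 bits and A in the highest, and the function name is invented for this example.

```python
def unpack_rgba8(element):
    """Split one raw 32-bit element into (r, g, b, a) bytes.

    Assumes R occupies bits 0-7, G bits 8-15, B bits 16-23, A bits 24-31.
    """
    return tuple((element >> shift) & 0xFF for shift in (0, 8, 16, 24))


def unpack_gather(elements):
    """Unpack each of the four raw gathered elements into RGBA tuples."""
    return [unpack_rgba8(e) for e in elements]
```

The same approach extends to other formats by changing the shift amounts and masks to match the channel layout of the underlying resource.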

Raw Gathers require Enhanced Barriers for the full set of formats the feature has been designed to support; see our preview Agility SDK to also get access to Enhanced Barriers. Without Enhanced Barriers and Relaxed Format Casting support, only the following formats that could previously be cast to uint views will be castable:

Existing sample and load operations require their offsets to be immediate integers. Programmers had to decide on the offset values they wanted prior even to shader compile time. To say the least, this made them of limited use.

Shader Model 6.7 frees offset arguments to the full suite of sample and load operations to be variable values just as they can be in gather operations. The effective range remains [-8,7], by respecting only the 4 least significant bits of the provided offset values. The full list of affected resource methods:
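The [-8,7] range follows from reading the low 4 bits of the offset as a signed two's-complement value. A short Python sketch of that interpretation, with a hypothetical function name:

```python
def effective_offset(value):
    """Return the offset actually applied when only the 4 least significant bits are respected."""
    low = value & 0xF  # keep the 4 least significant bits
    # Interpret the 4-bit pattern as a signed two's-complement number.
    return low - 16 if low >= 8 else low
```

Offsets already inside [-8,7] are unchanged, while anything outside wraps: under this interpretation 8 behaves like -8 and 16 behaves like 0.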

Previously, to perform a sample-and-compare operation, you could either use the default SampleCmp, which uses a MIP level determined by the location gradients, or access the zero level using SampleCmpLevelZero. This left an obvious gap in functionality where explicitly indexing smaller MIP levels would be useful. Shader Model 6.7 adds SampleCmpLevel, which simply allows you to specify the level you want to sample and compare to:
