This CL uses the decompose operation to emulate the gelu for DirectML
backend when the DML_FEATURE_LEVEL < 5_1. PTAL, thanks!
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
const NodeOutput* input =
GetNodeOutputForOperand(id_to_node_output_map, gelu->input_operand_id);
const TensorDesc& input_tensor_desc = input->GetTensorDesc();
Declare the variables close to where they are used.
DML_TENSOR_DATA_TYPE input_data_type = input_tensor_desc.GetDataType();
If we use `GetTensorDataType(data_type)`, this might be unnecessary?
const TensorDesc sqrt_output_tensor_desc = TensorDesc(input_data_type, {1});
`GetTensorDataType(data_type)`?
const TensorDesc sqrt_output_tensor_desc = TensorDesc(input_data_type, {1});
`/*dimensions*/ {1}`
"to create mutiply input operator."));
Remove "input", "multiply operator" is fine.
"to create mutiply input operator."));
typo: multiply
"For the emulation of gelu: failed to create mutiply "
Typo
"constant operator."));
ditto, remove "constant"
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Updated the code to reach the comments and updated the WPT baseline results. Please take another look, thanks a lot!
const NodeOutput* input =
GetNodeOutputForOperand(id_to_node_output_map, gelu->input_operand_id);
const TensorDesc& input_tensor_desc = input->GetTensorDesc();
Declare the variables close to where they are used.
Done
DML_TENSOR_DATA_TYPE input_data_type = input_tensor_desc.GetDataType();
If we use `GetTensorDataType(data_type)`, this might be unnecessary?
Done
const TensorDesc sqrt_output_tensor_desc = TensorDesc(input_data_type, {1});
Bin Miao`/*dimensions*/ {1}`
Done
const TensorDesc sqrt_output_tensor_desc = TensorDesc(input_data_type, {1});
Bin Miao`GetTensorDataType(data_type)`?
Done
Remove "input", "multiply operator" is fine.
Done
"to create mutiply input operator."));
Bin Miaotypo: multiply
Done
"For the emulation of gelu: failed to create mutiply "
Bin MiaoTypo
Done
"constant operator."));
Bin Miaoditto, remove "constant"
Done
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Code-Review | +1 |
lgtm % other reviewer's comments
Adding @rafael....@microsoft.com and @dwa...@microsoft.com for review. Thanks!
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Small comment, else LGTM Bin. Thanks for adding the emulation path to avoid failures on Win11 21H2 & 22H2.
const TensorDesc second_mul_output_tensor_desc =
TensorDesc(GetTensorDataType(data_type), /*dimensions*/ input_dimensions);
Does just `const TensorDesc second_mul_output_tensor_desc = input_tensor_desc;` work here? (or maybe even just a reference rename like `const TensorDesc& second_mul_output_tensor_desc = input_tensor_desc;` to avoid the copy entirely) Except for the broadcasted constants, every one of these elementwise tensor descs has the same data type and dimensions, and since there is a lot of tediousness here and potential for little typos, I'm looking for ways to reduce the length for maintainers.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Optimized the code and add a test case for gelu WPT conformance test to check result when input is a constant. Since the WPT conformance test already covers the gelu operator test in services unittests, the gelu operator part is deleted. Please take another look, thanks a lot!
const TensorDesc second_mul_output_tensor_desc =
TensorDesc(GetTensorDataType(data_type), /*dimensions*/ input_dimensions);
Does just `const TensorDesc second_mul_output_tensor_desc = input_tensor_desc;` work here? (or maybe even just a reference rename like `const TensorDesc& second_mul_output_tensor_desc = input_tensor_desc;` to avoid the copy entirely) Except for the broadcasted constants, every one of these elementwise tensor descs has the same data type and dimensions, and since there is a lot of tediousness here and potential for little typos, I'm looking for ways to reduce the length for maintainers.
Thank you for your comments and suggestion and this can indeed simplify the code and reduce copy. BTW, we need to use `const TensorDesc& xxx_output_tensor_desc = output_tensor_desc;` to avoid the case where the input is a constant.
Also add a test case to WPT to check if the input is a constant and please take another look.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
👍
const TensorDesc second_mul_output_tensor_desc =
TensorDesc(GetTensorDataType(data_type), /*dimensions*/ input_dimensions);
Bin MiaoDoes just `const TensorDesc second_mul_output_tensor_desc = input_tensor_desc;` work here? (or maybe even just a reference rename like `const TensorDesc& second_mul_output_tensor_desc = input_tensor_desc;` to avoid the copy entirely) Except for the broadcasted constants, every one of these elementwise tensor descs has the same data type and dimensions, and since there is a lot of tediousness here and potential for little typos, I'm looking for ways to reduce the length for maintainers.
Thank you for your comments and suggestion and this can indeed simplify the code and reduce copy. BTW, we need to use `const TensorDesc& xxx_output_tensor_desc = output_tensor_desc;` to avoid the case where the input is a constant.
Also add a test case to WPT to check if the input is a constant and please take another look.
Updated the WPT baseline results and the CL passed all the tests. Please take another look @rafael....@microsoft.com @ningx...@intel.com thanks a lot!
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Code-Review | +1 |
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Code-Review | +1 |
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Exportable changes to web-platform-tests were detected in this CL and a pull request in the upstream repo has been made: https://github.com/web-platform-tests/wpt/pull/46971.
When this CL lands, the bot will automatically merge the PR on GitHub if the required GitHub checks pass; otherwise, ecosystem-infra@ team will triage the failures and may contact you.
WPT Export docs:
https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md#Automatic-export-process
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
WebNN: Emulate gelu with decomposed operations
This CL uses the decompose operation to emulate the gelu for DirectML
backend when the DirectML feature level is lower than 5.1.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
The WPT PR for this CL has been merged upstream! https://github.com/web-platform-tests/wpt/pull/46971
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |