Set Ready For Review
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (gemm.c_operand_id && gemm.beta != 0.0f) {
Can we refine `SerializeLinearOperation` to avoid the mul operation, so that it can be reused both here and at L.5105?
return SerializeBinaryOperation(::tflite::BuiltinOperator_ADD,
Maybe a `BROADCAST_TO` operator that fills the output with zeros would work better than an add operation.
Uploaded PS11 to address the comments. Please take another look. Thanks!
if (gemm.c_operand_id && gemm.beta != 0.0f) {
Can we refine `SerializeLinearOperation` to avoid the mul operation, so that it can be reused both here and at L.5105?
Good suggestion! Refined `SerializeLinearOperation` in latest PS11.
return SerializeBinaryOperation(::tflite::BuiltinOperator_ADD,
Maybe a `BROADCAST_TO` operator that fills the output with zeros would work better than an add operation.
Added a new method `SerializeBroadcastToOperation` to use `BROADCAST_TO`. Also refactored the `SerializeExpand` function to reuse `SerializeBroadcastToOperation`.
ASSIGN_OR_RETURN(const TensorIndex constant_tensor_index,
Nit: keep naming consistent
```suggestion
ASSIGN_OR_RETURN(const TensorIndex beta_tensor_index,
```
return SerializeBroadcastToOperation(input_tensor_index, input_dimensions,
output_tensor_index);
Broadcasting seems unnecessary here; use Identity instead?
if (alpha == 0.0f) {
ASSIGN_OR_RETURN(const TensorIndex constant_tensor_index,
SerializeTensorWithBuffer<float>(
/*buffer=*/std::array<float, 1>{beta},
/*dimensions=*/{}));
return SerializeBroadcastToOperation(constant_tensor_index,
input_dimensions, output_tensor_index);
}
if (alpha == 1.0f) {
if (beta == 0.0f) {
return SerializeBroadcastToOperation(input_tensor_index, input_dimensions,
output_tensor_index);
}
ASSIGN_OR_RETURN(const TensorIndex beta_tensor_index,
SerializeTensorWithBuffer<float>(
/*buffer=*/std::array<float, 1>{beta},
/*dimensions=*/{}));
return SerializeBinaryOperation(::tflite::BuiltinOperator_ADD,
input_tensor_index, beta_tensor_index,
output_tensor_index);
}
if (beta == 0.0f) {
ASSIGN_OR_RETURN(const TensorIndex alpha_tensor_index,
SerializeTensorWithBuffer<float>(
/*buffer=*/std::array<float, 1>{alpha},
/*dimensions=*/{}));
return SerializeBinaryOperation(::tflite::BuiltinOperator_MUL,
input_tensor_index, alpha_tensor_index,
output_tensor_index);
}
I feel the optimization of the linear operator could be in a separate CL.
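For reference, the special cases in the snippet above amount to picking the cheapest lowering per (alpha, beta) pair for linear(x) = alpha * x + beta. A minimal sketch of that dispatch, with an invented function name and string labels standing in for the real operator emission:

```cpp
#include <string>

// Illustrative sketch only, not the Chromium serializer: which TFLite lowering
// each (alpha, beta) combination of linear(x) = alpha * x + beta selects,
// mirroring the special cases in the snippet above.
std::string PickLinearLowering(float alpha, float beta) {
  if (alpha == 0.0f) {
    // The alpha * x term vanishes; broadcast the constant beta to the output.
    return "BROADCAST_TO(beta)";
  }
  if (alpha == 1.0f) {
    // Identity scale: pass the input through, or add beta if it is non-zero.
    return beta == 0.0f ? "BROADCAST_TO(x)" : "ADD(x, beta)";
  }
  if (beta == 0.0f) {
    // Pure scale: a single MUL suffices.
    return "MUL(x, alpha)";
  }
  // General case: scale then shift.
  return "ADD(MUL(x, alpha), beta)";
}
```

The point of the optimization is that only the last case needs two operators; every special case collapses to one.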
if (gemm.c_operand_id && gemm.beta != 0.0f) {
lisha guo: Can we refine `SerializeLinearOperation` to avoid the mul operation, so that it can be reused both here and at L.5105?
Good suggestion! Refined `SerializeLinearOperation` in latest PS11.
I don't see much benefit in using the Linear operator here, because calling Linear(input, alpha, 0) is just a Mul operator, right? Or did I miss anything?
return SerializeBinaryOperation(::tflite::BuiltinOperator_ADD,
lisha guo: Maybe a `BROADCAST_TO` operator that fills the output with zeros would work better than an add operation.
Added a new method `SerializeBroadcastToOperation` to use `BROADCAST_TO`. Also refactored the `SerializeExpand` function to reuse `SerializeBroadcastToOperation`.
Consider using Broadcast(C, output_dimensions) instead?
output_tensor_type = quantized_output->data_type;
Was this case missed before? Do we have test coverage for it? @junw...@intel.com @feng...@intel.com
// When alpha is 0, the alpha * A * B term is 0, so skip the matrix
// multiplication computation.
This comment seems to duplicate the previous one?
// If there's a C term (beta * C), we need to add it to the zero tensor.
This comment is confusing: the latest code doesn't add it to the zero tensor. The following condition statement is clear on its own; consider removing this comment.
return SerializeLinearOperation(
c_tensor_info->dimensions, c_tensor_info->data_type,
c_tensor_info->index, output_tensor_index, gemm.beta, 0.0f);
If C is a scalar, you may want to broadcast the result of `beta * C` to 2-D. Do we have a test case covering it?
ASSIGN_OR_RETURN(const OperatorOffset op_offset,
SerializeLinearOperation(
c_tensor_info->dimensions, c_tensor_info->data_type,
c_tensor_info->index, output_tensor_index_of_beta_c,
gemm.beta, 0.0f));
The previous code using the Mul operator seems more straightforward. What's the benefit of using the Linear operator?
Uploaded PS12 to address the comments. Please take another look. Thanks! @ningx...@intel.com
ASSIGN_OR_RETURN(const TensorIndex constant_tensor_index,
Nit: keep naming consistent
```suggestion
ASSIGN_OR_RETURN(const TensorIndex beta_tensor_index,
```
Done
return SerializeBroadcastToOperation(input_tensor_index, input_dimensions,
output_tensor_index);
Broadcasting seems unnecessary here; use Identity instead?
Done
if (alpha == 0.0f) {
Agree. I will create another CL to optimize the linear operator separately.
if (gemm.c_operand_id && gemm.beta != 0.0f) {
lisha guo: Can we refine `SerializeLinearOperation` to avoid the mul operation, so that it can be reused both here and at L.5105?
Hu, Ningxin: Good suggestion! Refined `SerializeLinearOperation` in the latest PS11.
I don't see much benefit in using the Linear operator here, because calling Linear(input, alpha, 0) is just a Mul operator, right? Or did I miss anything?
Done
return SerializeBinaryOperation(::tflite::BuiltinOperator_ADD,
lisha guo: Maybe a `BROADCAST_TO` operator that fills the output with zeros would work better than an add operation.
Hu, Ningxin: Added a new method `SerializeBroadcastToOperation` to use `BROADCAST_TO`. Also refactored the `SerializeExpand` function to reuse `SerializeBroadcastToOperation`.
Consider using Broadcast(C, output_dimensions) instead?
Done
// When alpha is 0, the alpha * A * B term is 0, so skip the matrix
// multiplication computation.
This comment seems to duplicate the previous one?
Done
// If there's a C term (beta * C), we need to add it to the zero tensor.
This comment is confusing: the latest code doesn't add it to the zero tensor. The following condition statement is clear on its own; consider removing this comment.
Done
return SerializeLinearOperation(
c_tensor_info->dimensions, c_tensor_info->data_type,
c_tensor_info->index, output_tensor_index, gemm.beta, 0.0f);
If C is a scalar, you may want to broadcast the result of `beta * C` to 2-D. Do we have a test case covering it?
Added another two test cases to cover scalar C and 1D C.
ASSIGN_OR_RETURN(const OperatorOffset op_offset,
SerializeLinearOperation(
c_tensor_info->dimensions, c_tensor_info->data_type,
c_tensor_info->index, output_tensor_index_of_beta_c,
gemm.beta, 0.0f));
The previous code using the Mul operator seems more straightforward. What's the benefit of using the Linear operator?
ASSIGN_OR_RETURN(c_tensor_info, SerializeInputTensorInfo(
Move the `beta * c` expression here, and define a variable `std::optional<TensorIndex> c_expression_index` that can be used at L.5057 and L.5106.
output_tensor_type = quantized_output->data_type;
Was this case missed before? Do we have test coverage for it? @junw...@intel.com @feng...@intel.com
`output_tensor_dimensions` and `output_tensor_type` are used when [is_emulated_c_expression](https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/tflite/graph_builder_tflite.cc;l=5081?q=GraphBuilderTflite::SerializeGemm&ss=chromium%2Fchromium%2Fsrc) is true; please remove their definitions and use `output_tensor_info.dimensions` and `output_tensor_info.data_type` directly.
@junw...@intel.com Uploaded PS15 to address the comments. Please take another look. Thanks!
ASSIGN_OR_RETURN(c_tensor_info, SerializeInputTensorInfo(
Move the `beta * c` expression here, and define a variable `std::optional<TensorIndex> c_expression_index` that can be used at L.5057 and L.5106.
Sounds good. Moved `beta * c` here in PS15.
output_tensor_type = quantized_output->data_type;
Fu, Junwei: Was this case missed before? Do we have test coverage for it? @junw...@intel.com @feng...@intel.com
`output_tensor_dimensions` and `output_tensor_type` are used when [is_emulated_c_expression](https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/tflite/graph_builder_tflite.cc;l=5081?q=GraphBuilderTflite::SerializeGemm&ss=chromium%2Fchromium%2Fsrc) is true; please remove their definitions and use `output_tensor_info.dimensions` and `output_tensor_info.data_type` directly.
Removed these two lines:
```
output_tensor_dimensions = quantized_output->dimensions;
output_tensor_type = quantized_output->data_type;
```
| Code-Review | +1 |
ASSIGN_OR_RETURN(c_tensor_info, SerializeInputTensorInfo(
lisha guo: Move the `beta * c` expression here, and define a variable `std::optional<TensorIndex> c_expression_index` that can be used at L.5057 and L.5106.
Sounds good. Moved `beta * c` here in PS15.
Done
if (c_expression_index) {
nit: `if (c_expression_index && !is_emulated_c_expression)`
if (gemm.c_operand_id) {
Move L.5033 here: `if (gemm.c_operand_id && gemm.beta != 0.0f)`
Uploaded PS18 to fix these two nits. The `+1` was lost due to uploading the new patchset. Please take another look. @junw...@intel.com
Move L.5033 here. `if (gemm.c_operand_id && gemm.beta != 0.0f)`
Done
nit: `if (c_expression_index && !is_emulated_c_expression)`
Looks good, thanks @guol...@kylinos.cn! We plan to land only critical issue fixes before the 146 branch cut (Feb 9th). Could we hold this CL and land it after that?
lisha guo: I feel the optimization of the linear operator could be in a separate CL.
Agree. I will create another CL to optimize of linear operator separately.
Acknowledged
output_tensor_type = quantized_output->data_type;
Fu, Junwei: Was this case missed before? Do we have test coverage for it? @junw...@intel.com @feng...@intel.com
lisha guo: `output_tensor_dimensions` and `output_tensor_type` are used when [is_emulated_c_expression](https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/tflite/graph_builder_tflite.cc;l=5081?q=GraphBuilderTflite::SerializeGemm&ss=chromium%2Fchromium%2Fsrc) is true; please remove their definitions and use `output_tensor_info.dimensions` and `output_tensor_info.data_type` directly.
Removed these two lines:
```
output_tensor_dimensions = quantized_output->dimensions;
output_tensor_type = quantized_output->data_type;
```
Looks good, thanks @guol...@kylinos.cn! We plan to land only critical issue fixes before the 146 branch cut (Feb 9th). Could we hold this CL and land it after that?
Sure, please hold it per your plan. Wishing everything goes smoothly for you!
| Code-Review | +1 |
ASSIGN_OR_RETURN(const TensorInfo& a_tensor_info,
SerializeInputTensorInfo(
gemm.a_operand_id,
/*quantize_params=*/0,
/*operation_supports_float16=*/false, fuse_dequantize));
TensorIndex a_tensor_index = a_tensor_info.index;
// The permutation transposes the first or second 2-D tensor.
static constexpr std::array<uint32_t, 2> permutation = {1u, 0u};
if (gemm.a_transpose) {
ASSIGN_OR_RETURN(a_tensor_index,
InsertTransposeOperation(a_tensor_info, permutation));
}
Move this down to where it is first needed so we don't serialize an unused tensor.
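As a side note on the `{1u, 0u}` permutation quoted above: it maps output axis 0 to input axis 1 and vice versa, i.e. a plain 2-D transpose. A minimal sketch of the equivalent computation (illustrative only; the real code emits a TFLite TRANSPOSE operator rather than computing anything):

```cpp
#include <vector>

// Illustrative only: apply the permutation {1, 0} to a row-major rows x cols
// tensor, which swaps the axes and yields the matrix transpose.
std::vector<float> Transpose2D(const std::vector<float>& input, int rows,
                               int cols) {
  std::vector<float> output(input.size());
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      // Output axis 0 indexes the old columns, output axis 1 the old rows.
      output[c * rows + r] = input[r * cols + c];
    }
  }
  return output;
}
```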
ASSIGN_OR_RETURN(const TensorInfo& b_tensor_info,
SerializeInputTensorInfo(
gemm.b_operand_id,
/*quantize_params=*/0,
/*operation_supports_float16=*/false, fuse_dequantize));
TensorIndex b_tensor_index = b_tensor_info.index;
Move this down to where it is first needed so we don't serialize an unused tensor.
ASSIGN_OR_RETURN(const TensorIndex zero_tensor_index,
SerializeTensorWithBuffer<float>(
/*buffer=*/std::array<float, 1>{0.0f},
/*dimensions=*/{}));
Move this after the check below so that we only serialize a zero tensor when we actually need it.
ASSIGN_OR_RETURN(const TensorInfo& b_tensor_info,
SerializeInputTensorInfo(
gemm.b_operand_id,
/*quantize_params=*/0,
/*operation_supports_float16=*/false, fuse_dequantize));
TensorIndex b_tensor_index = b_tensor_info.index;
Move this down to where it is first needed so we don't serialize an unused tensor.
As of Lisha's patchset 21, all operands must be serialized. Without this step:
For input operands, a crash may occur at the [FinishAndTakeResult](https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/tflite/graph_builder_tflite.cc;drc=56c66e417c83e2096a4e4e8a5c4ab7bbd525c9f3;l=2911) stage:
```
auto get_name_and_index = [&](OperandId operand_id) {
const TensorInfo& info = operand_to_tensor_info_map_.at(operand_id);
```
For output operands, other operators in a graph may still reference this operand.
@lisha please make sure the `C` operand also be serialized, thanks.
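A minimal model of the failure mode described above (illustrative, not the Chromium code, where the map is `operand_to_tensor_info_map_` and a missing key fails hard at `FinishAndTakeResult`): calling `.at()` for an operand that was never serialized into the map throws `std::out_of_range`.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Model of the lookup in FinishAndTakeResult: std::map::at() throws
// std::out_of_range for any operand id that was never serialized into the
// map, so serializing A, B and C eagerly guarantees every lookup succeeds.
bool OperandLookupSucceeds(const std::map<int, std::string>& operand_to_tensor,
                           int operand_id) {
  try {
    operand_to_tensor.at(operand_id);  // Throws if operand_id was never added.
    return true;
  } catch (const std::out_of_range&) {
    return false;
  }
}
```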
ASSIGN_OR_RETURN(const TensorInfo& b_tensor_info,
SerializeInputTensorInfo(
gemm.b_operand_id,
/*quantize_params=*/0,
/*operation_supports_float16=*/false, fuse_dequantize));
TensorIndex b_tensor_index = b_tensor_info.index;
Fu, Junwei: Move this down to where it is first needed so we don't serialize an unused tensor.
As of Lisha's patchset 21, all operands must be serialized. Without this step:
For input operands, a crash may occur at the [FinishAndTakeResult](https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/tflite/graph_builder_tflite.cc;drc=56c66e417c83e2096a4e4e8a5c4ab7bbd525c9f3;l=2911) stage:
```
auto get_name_and_index = [&](OperandId operand_id) {
const TensorInfo& info = operand_to_tensor_info_map_.at(operand_id);
```
For output operands, other operators in a graph may still reference this operand.
@lisha please make sure the `C` operand also be serialized, thanks.
Good catch. Make sure to serialize A, B and C early and add a one-line comment explaining why.
| Code-Review | +1 |
ASSIGN_OR_RETURN(const TensorInfo& a_tensor_info,
SerializeInputTensorInfo(
gemm.a_operand_id,
/*quantize_params=*/0,
/*operation_supports_float16=*/false, fuse_dequantize));
TensorIndex a_tensor_index = a_tensor_info.index;
// The permutation transposes the first or second 2-D tensor.
static constexpr std::array<uint32_t, 2> permutation = {1u, 0u};
if (gemm.a_transpose) {
ASSIGN_OR_RETURN(a_tensor_index,
InsertTransposeOperation(a_tensor_info, permutation));
}
Move this down to where it is first needed so we don't serialize an unused tensor.
Acknowledged
ASSIGN_OR_RETURN(const TensorInfo& b_tensor_info,
SerializeInputTensorInfo(
gemm.b_operand_id,
/*quantize_params=*/0,
/*operation_supports_float16=*/false, fuse_dequantize));
TensorIndex b_tensor_index = b_tensor_info.index;
Fu, Junwei: Move this down to where it is first needed so we don't serialize an unused tensor.
Reilly Grant: As of Lisha's patchset 21, all operands must be serialized. Without this step:
For input operands, a crash may occur at the [FinishAndTakeResult](https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/tflite/graph_builder_tflite.cc;drc=56c66e417c83e2096a4e4e8a5c4ab7bbd525c9f3;l=2911) stage:
```
auto get_name_and_index = [&](OperandId operand_id) {
const TensorInfo& info = operand_to_tensor_info_map_.at(operand_id);
```
For output operands, other operators in a graph may still reference this operand.
@lisha please make sure the `C` operand also be serialized, thanks.
Good catch. Make sure to serialize A, B and C early and add a one-line comment explaining why.
Acknowledged
ASSIGN_OR_RETURN(const TensorIndex zero_tensor_index,
SerializeTensorWithBuffer<float>(
/*buffer=*/std::array<float, 1>{0.0f},
/*dimensions=*/{}));
Move this after the check below so that we only serialize a zero tensor when we actually need it.
@rei...@chromium.org Added a new comment per your suggestion. Please take another look. Thanks!