Hello Alex,
  pm.addPass(TFL::CreatePrepareQuantizePass(quant_specs));
  pm.addPass(TFL::CreateQuantizePass(
      verify_numeric, whole_model_verify, legacy_float_scale,
      blocklisted_mlir_op_names, blocklisted_nodes));
  pm.addPass(TFL::CreatePostQuantizePass(/*emit_quant_adaptor_ops=*/true));
  pm.addPass(TFL::CreateOptimizeOpOrderPass());
  pm.addPass(TFL::CreateModifyIONodesPass(input_mlir_type, output_mlir_type));
  if (!blocklisted_ops.empty() || !blocklisted_nodes.empty()) {
    // If the first or final ops are not quantized, remove QDQ.
    pm.addPass(TFL::CreatePostQuantizeRemoveQDQPass());
  }
Each pass handles its own share of the quantization steps. The three matching (Prepare|Post)?QuantizePass, that is PrepareQuantizePass, QuantizePass, and PostQuantizePass, do most of the work. To inspect each pass in detail, you can use tf-opt to run the MLIR passes manually on your calibrated model.
Regards,
Tei