I thought that this conversation should have its own thread.
The theory is that fusion is not intended for finding clusters of ops which map to specific backend-supported operations (e.g. 'average pool'). So I am going to migrate our backend to use Call instead of Fusion. As previously discussed, I am not using CustomCall because it doesn't allow the op to find the replaced instructions safely.
Call is not as easy to use as I first thought it would be.
1) While Call allows me to get back to the original instructions, it doesn't have anywhere to store the general type of the replacement. I have used the name of the sub-computation. That works, although the name is mangled by a uniquifier. I can work around this, but I am now assuming that the uniquifier will only append a '.' followed by arbitrary data, and that no other mangling of the name prior to the '.' will occur. I don't think that this is a big issue.
2) Unfortunately, the code that extracts ops and replaces them with a Call isn't as sophisticated as the code that extracts ops and replaces them with a Fusion. Specifically, the fusion code can take a sub-graph in which some nodes are used by ops outside the sub-graph and do the right thing: duplicate the node into the fusion computation and leave the original in the main graph. Consider extracting the const+add from this graph:
    const     param
     |  \     /
     |    add
     |     |
      \   /
       sub
        |
OutlineExpressionFromComputation cannot do it, but CreateFusionInstruction can. The error from OutlineExpressionFromComputation would be "The subcomputation to outline has multiple outputs:" because it doesn't allow for the const to remain outside the fusion and be duplicated within it.
    Main comp            Fusion comp
    const   param        const   param
      |       |             \     /
      |     fusion         -- add --
       \     /                 |
        \   /
       -- sub --
           |
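To make the condition in point 2 concrete, here is a minimal sketch on a toy graph. It is not XLA code: Graph and NodesNeedingDuplication are hypothetical names invented for illustration. It finds the nodes of the sub-graph, other than its root, that still have users outside it. An outliner that insists on a single output would have to reject a non-empty result, whereas a fusion-style extractor would clone those nodes into the new computation and leave the originals in place.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy graph: each node name maps to the names of its operands.
// Hypothetical stand-in for illustration, not XLA's HloComputation.
using Graph = std::map<std::string, std::vector<std::string>>;

// Returns the nodes of `subgraph`, other than `root`, that have at least one
// user outside `subgraph`. These are the nodes a fusion-style extractor would
// duplicate rather than move.
std::set<std::string> NodesNeedingDuplication(
    const Graph& graph, const std::set<std::string>& subgraph,
    const std::string& root) {
  std::set<std::string> result;
  for (const auto& entry : graph) {
    const std::string& user = entry.first;
    if (subgraph.count(user) > 0) continue;  // Only look at outside users.
    for (const std::string& operand : entry.second) {
      if (operand != root && subgraph.count(operand) > 0) {
        result.insert(operand);
      }
    }
  }
  return result;
}
```

For the graph above, with sub-graph {const, add} and root add, the only node reported is const, which is the one that needs to exist both inside and outside the extracted computation.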
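For the name workaround in point 1, a minimal sketch of recovering the replacement type from a uniquified computation name might look like the following. BaseComputationName is a hypothetical helper, not an existing XLA function, and it relies on the assumption stated in point 1 (the uniquifier only appends '.' plus a suffix), plus the further assumption that the name I choose contains no '.' itself.

```cpp
#include <string>

// Returns the part of `name` before the first '.', or the whole name when it
// contains no '.'. Assumes the original, un-uniquified name has no '.' in it.
std::string BaseComputationName(const std::string& name) {
  // find() returns npos when there is no '.', and substr(0, npos) is the
  // whole string, so both cases are handled by one expression.
  return name.substr(0, name.find('.'));
}
```

If the uniquifier ever changed the part before the '.', this lookup would silently stop matching, which is the fragility described in point 1.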
The solutions seem to be:
1) use a common code base for OutlineExpressionFromComputation and CreateFusionInstruction.
2) keep using CreateFusionInstruction, without extending the enumeration, but add general annotations to the HloInstruction.
Option 1 seems like a better solution if you intend to eliminate fusion. Would it be ok for me to make the OutlineExpressionFromComputation code use the fusion code for extraction?
David