Hi Songlin,
Thanks for starting this work. My reply is twofold: how I think we should approach this for the long term, and how I think we should approach this for the short term.
Long-term considerations
I've admittedly only thought tangentially about multiselect, and I'm not familiar with how the plugin actually works. I actually want us to step back and consider what multiselect means in the world of v12+ Blockly where we need to consider accessibility implications of selection. For some background context, FocusManager exists to ensure that we synchronize DOM focus with what the user expects is 'focused' (or being currently interacted with) since screen readers rely on the current DOM focus state for determining what to read out (where ARIA and other factors determine the actual text read out). We found during development that this carefully managed state actually simplifies a bunch of other things, including the keyboard navigation cursor and selection. In fact, it simplified selection so much that we completely removed the internal tracking for it and just instead defer to FocusManager (selection now means whatever is currently focused if it's selectable). That was sufficient for single selection, but I had assumed this wouldn't be fully compatible with multiselect.
Thinking about how things should work from a user perspective, there are at least two different ways this could be approached:
Approach 1
Using a spreadsheet metaphor: only one cell in a spreadsheet can ever be typed into, which means only one cell can have focus. Multiple cells can be selected, but exactly one always holds the current focus. This makes sense from a modeling perspective, but it may be confusing from a user perspective: a user could use ctrl+enter to open a context menu during a multiselect and end up performing an operation only on whichever block currently holds focus. This also makes keyboard block movement confusing: does it act on the multiselection or on an individual block?
The spreadsheet parallel probably applies here, but there are (at least) three distinct challenges that need to be solved to make it work:
- How do we visually distinguish selection from focus? Today, they are the same. We use the selection highlight to indicate what we deem 'active' focus (that is, the element currently interactive to the user) largely because selection highlighting already existed. We intend for selection and active highlighting to be independently customizable, but that's not quite possible yet. That being said, we don't know how we would handle the multiselect case, where it becomes clear that there needs to be a visual distinction.
- How do we actually model the domain logic to make this work?
- Similar to the visual element, how does screen reader output work for multiselect? Screen reader work is very early right now in Blockly, but the entirety of the focus system was designed with it in mind, so the question definitely needs to be asked.
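To make the modeling question a bit more concrete, here is a rough sketch of the spreadsheet-style model: a set of selected items plus exactly one focused item. Everything in it (SpreadsheetStyleSelection, toggle, targetsFor, the string ids) is hypothetical and made up for illustration; it is not Blockly's FocusManager or the plugin's API.

```typescript
// Hypothetical model, not Blockly's API: many selected ids, at most one focused id.
class SpreadsheetStyleSelection {
  private selected = new Set<string>();
  private focused: string | null = null;

  /** Replaces the selection with a single item, which also takes focus. */
  selectOnly(id: string): void {
    this.selected = new Set<string>([id]);
    this.focused = id;
  }

  /** Adds or removes an item (e.g. shift+click). A newly added item takes focus. */
  toggle(id: string): void {
    if (this.selected.has(id)) {
      this.selected.delete(id);
      if (this.focused === id) {
        // Keep the invariant: focus stays on some selected item if any remain.
        const remaining = Array.from(this.selected);
        this.focused = remaining.length > 0 ? remaining[0] : null;
      }
    } else {
      this.selected.add(id);
      this.focused = id;
    }
  }

  getFocused(): string | null {
    return this.focused;
  }

  getSelected(): string[] {
    return Array.from(this.selected);
  }

  /**
   * The ambiguity described above: every action has to decide whether it
   * applies to the focused item only or to the whole selection.
   */
  targetsFor(scope: 'focused' | 'selection'): string[] {
    if (scope === 'focused') {
      return this.focused === null ? [] : [this.focused];
    }
    return this.getSelected();
  }
}
```

The targetsFor parameter is where the user-facing confusion shows up: a context menu or a keyboard move has to pick one scope, and the user has no obvious way to know which one was picked.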
Approach 2
Treat selecting more than one thing as a case where the 'selection' itself has 'focus' (e.g. as a transient construct). This could simplify a number of things:
- This makes it easier to ensure consistency between keyboard and mouse usage.
- We no longer have to visually distinguish between active focus and multiselect: just make multiselect the thing with focus. We can specialize the styling to make it more obvious that multiple elements hold 'focus'.
- It simplifies screen reader support: the multiselection would be an element that could hold DOM focus and thus have its own ARIA labels/context set up to provide an auditory context for the group of selectables.
- The domain logic may be closer to what's implemented in the plugin today (though I only looked briefly), and will probably be more interoperable with v12+ Blockly. We would need to adjust the focusNode logic in places to focus on the multiselection, if present, rather than on a single node. This might require some finickiness in ways I don't fully know the implications of: we would need to support representing a contextual object in dragging and other operations rather than a singular focusable node. This may make multiselect in plugin form prohibitively difficult vs. adding it to core (which is something we've wanted to do at some point, regardless).
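As a rough illustration of this second approach, the sketch below treats the multiselection as a single transient object that can itself be the focus target and can describe itself for a screen reader. All names here (FocusTarget, BlockTarget, MultiSelectionTarget, resolveFocusTarget, operationTargets) are hypothetical stand-ins, not Blockly's actual focus interfaces, and the label format is only an assumption about what an ARIA label might say.

```typescript
// Hypothetical stand-in for "something that can hold focus"; not a Blockly interface.
interface FocusTarget {
  readonly id: string;
  /** Text a screen reader might announce for this target. */
  describe(): string;
}

class BlockTarget implements FocusTarget {
  readonly id: string;
  private readonly label: string;

  constructor(id: string, label: string) {
    this.id = id;
    this.label = label;
  }

  describe(): string {
    return this.label;
  }
}

/** Transient object that represents the whole multiselection as one focus target. */
class MultiSelectionTarget implements FocusTarget {
  readonly id = 'multiselection';
  readonly members: ReadonlyArray<FocusTarget>;

  constructor(members: ReadonlyArray<FocusTarget>) {
    this.members = members;
  }

  describe(): string {
    const names = this.members.map((m) => m.describe()).join(', ');
    return `${this.members.length} items selected: ${names}`;
  }
}

/** Picks what should receive focus: nothing, the single item, or the group. */
function resolveFocusTarget(
  selected: ReadonlyArray<FocusTarget>
): FocusTarget | null {
  if (selected.length === 0) return null;
  if (selected.length === 1) return selected[0];
  return new MultiSelectionTarget(selected);
}

/** Operations (drag, delete, move) ask the focus target what they apply to. */
function operationTargets(
  focus: FocusTarget | null
): ReadonlyArray<FocusTarget> {
  if (focus === null) return [];
  return focus instanceof MultiSelectionTarget ? focus.members : [focus];
}
```

With this shape there is only ever one focused thing, so keyboard, mouse, and screen reader paths all ask the same question. The cost is that any code assuming the focused thing is a single block has to go through something like operationTargets instead.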
Conclusion
My suggestion: we should align on what the user experience should be from mouse, keyboard, and screen reader perspectives. After that we can determine what the ideal solution is within core Blockly, and then consider how we might support that implementation in the plugin. I suggest this approach because I can't be sure the current way the multiselect plugin works will even be compatible with the focus system or screen reader support--given how much of core Blockly doesn't "just work" for these use cases, I'm very much expecting multiselect to be in a similar situation. Also not discussed here but equally important: approaches 1 and 2 above will have implications for keyboard navigation, since we will need to consider how keyboard navigation works both for initiating/changing a multiselect and for operating on one (which requires some level of dynamic detection for navigation).
Short-term considerations
The biggest unknown to me is how much of this we will be able to make work via a plugin, or whether multiselect can only reasonably be made to work with core Blockly changes (which would mean possibly targeting the v13 release next year). I really hope we can at least introduce a stopgap for v12 so that we're not breaking multiselect for all v12 users--that is not what we want. It may be that the idea here of using a FocusableNode to represent the multiselection actually works correctly, and could provide a real basis for approach 2 above. The main challenge I expect is the many focusNode() calls, which in many cases assume that a block is the thing to focus.
I will need to look into the multiselect plugin in more detail to provide more concrete recommendations and to properly think about your existing changes, but I might not have time to do that for a few weeks.
Regards,
Ben