Intro
This text aims to provide a short introduction and spark a discussion on the topic/problem of implementing a cross-browser solution for seamless remote-local clipboard sync.
Recently there’s been some activity on the noVNC clipboard front: PR #1993 [1] has been merged, but it is being held back from the 1.7 release for now due to browser interoperability concerns. The reasoning is that we’d like to get some additional signals from the engine teams and browser vendors before committing to engine-specific UX.
Note: throughout this text Blink, Gecko, and WebKit refer to browser engines, while Chrome, Firefox, and Safari refer to their mainstream implementations. Other browsers exist, but these three are treated as representative.
Lack of consensus surrounding permissions
The async clipboard API used in PR #1993 [1] relies on two permissions: clipboard-read and clipboard-write. When these are supported, automatic local-to-remote clipboard sync is possible, giving a seamless user experience similar to copying and pasting in any editable context on the web.
Unfortunately this approach is only embraced by Blink [2], while Gecko [3, 4, 5] and WebKit [4, 5, 6] have chosen a more restrictive model, one which aligns more closely with the W3C spec direction [7].
In fact, these clipboard permissions used to be supported by the spec, but it seems their days are numbered. The read permission has already been removed [8] from the spec, and the write permission is likely to follow [9], though Chrome has not yet aligned with this and progress appears stalled. Either Chrome aligns with the others on this, or it continues to support this permissions model independently. As for Firefox and WebKit, they’ve been clear about their stance on this. I’ve forwarded our use case to them in bug reports [10, 11].
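To make the divergence concrete, here is a minimal sketch of how a page might probe for the two permissions before choosing a sync strategy. It is not taken from PR #1993; the names probeClipboardPermission, canSyncAutomatically and QueryFn are hypothetical, and the query function is injected (standing in for navigator.permissions.query) so the logic can be exercised outside a browser. It assumes that engines which do not recognise the permission names reject the query.

```typescript
// Hypothetical helper types; `QueryFn` stands in for navigator.permissions.query.
type PermissionResult = "granted" | "denied" | "prompt" | "unsupported";
type QueryFn = (descriptor: { name: string }) => Promise<{ state: string }>;

async function probeClipboardPermission(
  query: QueryFn,
  name: "clipboard-read" | "clipboard-write",
): Promise<PermissionResult> {
  try {
    const status = await query({ name });
    if (
      status.state === "granted" ||
      status.state === "denied" ||
      status.state === "prompt"
    ) {
      return status.state;
    }
    return "unsupported";
  } catch {
    // Assumption: engines without these permissions reject the query
    // because the permission name is unknown to them.
    return "unsupported";
  }
}

// Automatic sync is only plausible when both permissions exist and
// neither has been denied ("prompt" means the user can still grant it).
async function canSyncAutomatically(query: QueryFn): Promise<boolean> {
  const [read, write] = await Promise.all([
    probeClipboardPermission(query, "clipboard-read"),
    probeClipboardPermission(query, "clipboard-write"),
  ]);
  const usable = (r: PermissionResult) => r === "granted" || r === "prompt";
  return usable(read) && usable(write);
}
```

In a browser, the injected function would be a thin wrapper around navigator.permissions.query. When canSyncAutomatically resolves to false, the page would have to fall back to gesture-bound reads and writes.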
The timing of reading and writing
Testing and documentation of the async clipboard API implementations reveal these prominent differences in particular:
Read & Write
Chrome 142: permissions supported, no user gesture required, unlimited grace period
Firefox 144: permissions unsupported, user gesture required, 5-second grace period
Safari 26: permissions unsupported, user gesture required, 1-second grace period
Read-specific
Chrome 142: no additional prompt
Firefox 144: native paste prompt
Safari 26: native paste prompt
Clipboard reading is the more privacy-sensitive operation, so Firefox and Safari add a native paste prompt to obtain unambiguous user approval, even when the call happens within a user gesture context. This complicates the user experience, since every read must be tied to a user gesture and the user must then approve a prompt each time. Note that Firefox (not sure about Safari) supports suppressing the paste prompt under same-origin conditions [12]; sadly, noVNC cannot hope to meet this requirement.
A central problem for the use case of noVNC (and similar web apps) is the timing of clipboard operations, in part highlighted in a W3C issue some years ago [13]. Clipboard reading is typically the hardest to get right across browsers:
We cannot know when the clipboard has changed
To maintain sync, we must unconditionally read at specific times
Delegating the read initiative to users undermines the aspect of automation
In practice this means that reading the system clipboard and syncing with the server must be done upon document focus, i.e. when the user returns to the noVNC tab. This is the model in PR #1993 [1] and earlier iterations such as PR #1347 [14]. With persistent permissions this works perfectly. Without them, however, every focus would trigger a paste prompt, which is a poor user experience that quickly becomes annoying.
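The read-on-focus model can be sketched roughly as follows. This is an illustration under assumptions, not the code from PR #1993 or PR #1347: createFocusSync, SyncDeps, readText and sendToServer are hypothetical names, and the clipboard read is injected (in a browser it would wrap navigator.clipboard.readText) so the logic runs outside a browser too.

```typescript
// Hypothetical dependencies, injected so the sketch is testable:
//   readText     - in a browser, a wrapper around navigator.clipboard.readText()
//   sendToServer - whatever forwards the text to the remote session
interface SyncDeps {
  readText: () => Promise<string>;
  sendToServer: (text: string) => void;
}

// Returns a handler intended to be called from a window "focus" listener.
// The handler resolves to true when new text was forwarded to the server.
function createFocusSync(deps: SyncDeps): () => Promise<boolean> {
  let lastSent: string | null = null;

  return async function onFocus(): Promise<boolean> {
    let text: string;
    try {
      // We cannot know whether the clipboard changed, so read unconditionally.
      text = await deps.readText();
    } catch {
      // The read was rejected: permission missing, no user gesture,
      // or the paste prompt was dismissed.
      return false;
    }
    if (text === lastSent) {
      return false; // unchanged since the last sync, nothing to send
    }
    lastSent = text;
    deps.sendToServer(text);
    return true;
  };
}
```

With a persistent clipboard-read permission the handler runs silently on every focus. In browsers that show a paste prompt, each call to readText would surface that prompt, which is exactly the ergonomic problem described above.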
The timing of writing to the system clipboard is a bit different. Naturally, for the local and remote to be in sync, writing to the system clipboard must be done every time a copy is initiated through noVNC, i.e. when selecting a piece of text or pressing Ctrl+C (keep in mind that Linux features a primary and secondary clipboard). However, the copy process completes some time after the input handlers have exited, making it difficult to associate the write with a valid user gesture. A persistent clipboard-write permission, on the other hand, solves this elegantly, as we’re allowed to programmatically update the clipboard at arbitrary moments.
One hacky workaround for restrictive browsers involves timestamp-based logic: listen to all keyboard and mouse events (should not impose any performance issues), infer when copying occurred and conditionally update the clipboard.
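One possible reading of the timestamp-based workaround is sketched below. It assumes the write only has a chance of succeeding while the most recent input event is still within the browser’s grace period; GestureWindow, handleRemoteClipboard and the graceMs value are hypothetical, and the grace period would need tuning per browser (the comparison above suggests roughly 5 seconds for Firefox and 1 second for Safari).

```typescript
// Hypothetical tracker for the time of the most recent user input.
class GestureWindow {
  private lastInputMs = Number.NEGATIVE_INFINITY;
  private readonly graceMs: number;

  constructor(graceMs: number) {
    this.graceMs = graceMs;
  }

  // Call from every keyboard and mouse handler with the event time in ms.
  noteInput(timestampMs: number): void {
    this.lastInputMs = timestampMs;
  }

  // True if `nowMs` falls inside the grace period after the last input.
  isWithinGrace(nowMs: number): boolean {
    const elapsed = nowMs - this.lastInputMs;
    return elapsed >= 0 && elapsed <= this.graceMs;
  }
}

// Called when the server reports new clipboard text. `writeText` is injected;
// in a browser it would wrap navigator.clipboard.writeText.
// Returns true if a write was attempted.
function handleRemoteClipboard(
  text: string,
  nowMs: number,
  gesture: GestureWindow,
  writeText: (text: string) => void,
): boolean {
  if (!gesture.isWithinGrace(nowMs)) {
    return false; // no recent input: the browser would likely reject the write
  }
  writeText(text);
  return true;
}
```

The sketch only gates the write on recency of input. Inferring that a given key press or selection actually was a copy would need additional logic on top of this.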
Promising outlook
There’s an experimental feature currently being evaluated called the “clipboardchange” event [15, 16]. It allows the browser to:
Detect clipboard changes while the document is unfocused
Fire an event upon refocus
Reveal MIME types only, not actual content, thus preserving privacy
This could potentially be combined with some sort of notification prompting the user to sync the clipboard. As a result, we wouldn’t have to unconditionally read the system clipboard, eliminating repeated prompts.
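How such an event might be consumed is sketched below. Since the feature is experimental, the event shape here is an assumption: the sketch only relies on an optional list of MIME types, and ClipboardChangeLike, ClipboardTargetLike, watchClipboard and onTextAvailable are hypothetical names. The target is injected (in a supporting browser it would presumably be navigator.clipboard) so the logic can run anywhere.

```typescript
// Minimal shapes assumed for the experimental event; real browsers may differ.
interface ClipboardChangeLike {
  types?: readonly string[]; // MIME types only, never the clipboard content
}
interface ClipboardTargetLike {
  addEventListener(
    type: "clipboardchange",
    listener: (ev: ClipboardChangeLike) => void,
  ): void;
}

// Invokes `onTextAvailable` whenever the clipboard is reported to hold
// plain text. The callback could, for example, show a "sync clipboard"
// notification, so that the actual read only happens on user request.
function watchClipboard(
  target: ClipboardTargetLike,
  onTextAvailable: () => void,
): void {
  target.addEventListener("clipboardchange", (ev) => {
    const types = ev.types ?? [];
    if (types.includes("text/plain")) {
      onTextAvailable();
    }
  });
}
```

Because the event carries no content, the read itself would still be subject to the gesture and prompt rules described earlier, but it would be triggered only when there is something new to read.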
Concluding thoughts
Chrome simply offers better UX for the async clipboard integration, primarily due to the persistent permissions model. However, the W3C spec is moving away from such permissions, and the browser landscape appears unlikely to align with Chrome’s approach. Since the UX difference is quite palpable, there’s also the risk of inadvertently pushing noVNC users toward Chromium browsers.
Ultimately though, should noVNC pursue a browser-divergent clipboard UX, with one branch built on potentially unstable features, and one offering a poor user experience?
https://groups.google.com/a/chromium.org/g/blink-dev/c/epeaao7l13M/m/h3FONN3RAwAJ
https://groups.google.com/a/mozilla.org/g/dev-platform/c/doxNXSEFtEE/m/CGsR1fLlAgAJ