The Tight encoding specification requires rectangles to be <= 2048 pixels in width, but there isn't any documented limit on the rectangle height, and thus no documented limit on the overall rectangle size. I don't necessarily think that you're playing with fire, although the Tight encoder has never been tested with rectangles > 64k, so I can't guarantee that there aren't hidden bugs. However, I question whether Tight encoding is the most appropriate way to transfer pixels in a loopback environment. It seems like you might be better served by transferring the pixels using Raw encoding, which would require more bus bandwidth but less CPU time.
Referring to https://turbovnc.org/pmwiki/uploads/About/tighttoturbo.pdf, a big reason why TurboVNC itself doesn't use larger rectangles is that there's a tradeoff in terms of encoding efficiency. The larger the rectangle, the more likely it is that the number of unique colors in the rectangle will exceed the Tight palette threshold. Thus, as the rectangle size increases, you will reach a point at which only JPEG is used to encode the non-solid subrectangles. At that point, there isn't much benefit to the complexity of Tight encoding, and you'd probably get better performance by simply encoding every rectangle as a pure JPEG image. I am not sure whether 1 megapixel is beyond that point, but given that the palette threshold is low (24 or 96 colors, depending on the Tight compression level), it wouldn't surprise me if 1-megapixel rectangles are almost always encoded as JPEG. In that case, the only real benefit you'd get from Tight encoding is a slight reduction in the bitstream size if there are huge areas of solid color, since the Tight encoder can encode those as a bounding box and fill color (whereas JPEG has a not-insignificant amount of overhead, both in terms of compute time and bitstream size, when encoding a single-color image.) However, I don't know whether that benefit is worth the additional computational overhead of analyzing the rectangle, nor whether it is worth the additional bitstream size overhead of dividing non-solid areas of the rectangle into multiple JPEG subrectangles (as opposed to sending the whole rectangle as a single JPEG image.)
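To make the palette-threshold tradeoff concrete, here is a minimal illustrative sketch (my own code, not TurboVNC's actual implementation) of the kind of test an encoder performs: scan the rectangle's pixels and bail out to JPEG as soon as the number of unique colors exceeds the threshold. The function name and linear palette search are assumptions for illustration; the threshold would be 24 or 96, per the Tight compression level.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative sketch: returns 1 if the rectangle has few enough unique
   colors to qualify for indexed-color subencoding, or 0 if it should
   fall through to JPEG.  Assumes maxColors <= 96 (the largest Tight
   palette threshold mentioned above.) */
int qualifies_for_palette(const uint32_t *pixels, size_t count, int maxColors)
{
    uint32_t palette[96];
    int nColors = 0;

    for (size_t i = 0; i < count; i++) {
        int j, found = 0;
        for (j = 0; j < nColors; j++)
            if (palette[j] == pixels[i]) { found = 1; break; }
        if (!found) {
            if (nColors == maxColors)
                return 0;  /* threshold exceeded: encode as JPEG */
            palette[nColors++] = pixels[i];
        }
    }
    return 1;
}
```

The larger the rectangle, the more pixels are scanned, and the more likely this test is to fail, which is the effect described above.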
That also explains why I wouldn't be willing to add a Tight
compression level with a rectangle size of 1048576. I would,
however, be willing to support the pure RFB JPEG encoding type
in TurboVNC, if it proves to be of any benefit. That encoding
type is dead simple and would involve merely passing every RFB
rectangle directly to libjpeg-turbo.
DRC
--
You received this message because you are subscribed to the Google Groups "TurboVNC Developer Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to turbovnc-deve...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/turbovnc-devel/df98e4d5-7d4c-4f8c-acc1-4cca24b59174n%40googlegroups.com.
I am not opposed to implementing pure JPEG
encoding (especially if I can get funded development for it),
but I probably wouldn't expose it in the TurboVNC Viewer GUI
(yet), since it has a somewhat esoteric use case. The
implementation would be relatively simple. I would just feed
the rectangle directly into libjpeg-turbo using the same JPEG
quality and subsampling settings that are currently used by the
Tight encoder. The pure JPEG encoding type specifies that the
encoder can send motion-JPEG frames without Huffman or
quantization tables, which would
avoid some of the JPEG header overhead with smaller rectangles
and might even eliminate some of the advantage of the hybrid
Tight encoding methods. However, the TurboJPEG API would need
to be extended to support the creation of motion-JPEG frames.
Thus, in order to implement the feature the right way, I would
ideally want to secure funding in order to extend the TurboJPEG
API and do a cursory low-level study regarding the performance
advantages of motion-JPEG frames.
As an aside, I am also interested in
replacing the existing Lossless Tight encoding methods with
TightPNG, but that would require extensive low-level research
that I don't have the funding or time to conduct right now. It
would be great to completely revisit the encoder design from the
ground up, obtaining completely new test datasets using modern
applications. (The 2D datasets in particular use obsolete
applications, window managers, and X11 rendering paradigms.
Modern X11 applications are much more likely to use image-based
rendering, which would likely tilt the performance scales more
in favor of JPEG than indexed color encoding.) However, I don't
predict that anyone cares enough about that to pay for the labor
required to do it. Supporting pure JPEG encoding would be a lot
more realistic in terms of funding, since that is probably more
like a 15-25 hour project than a hundreds-of-hours project.
I did some low-level experiments with the TurboVNC Benchmark Tools (https://github.com/TurboVNC/vncbenchtools), comparing the existing TurboVNC encoder, accelerated with the Intel zlib implementation and using the "Tight + Perceptually Lossless JPEG" and "Tight + Medium-Quality JPEG" presets, against pure JPEG encoding with the same JPEG quality and subsampling levels. The results were interesting.
Perceptually Lossless JPEG:
As expected based on prior research
(https://turbovnc.org/pmwiki/uploads/About/tighttoturbo.pdf),
pure JPEG with no other modifications produced worse (often much
worse) compression with all datasets except Lightscape,
GLXSpheres, and Quake 3, which compressed about 4-5% better.
(Catia, Teamcenter Vis, Unigraphics NX, and Google Earth
regressed by less than 10%.) Pure JPEG with no other
modifications also produced worse (often much worse) performance
with all datasets except Pro/E (+8%) and Quake 3 (+4%). (3D
Studio Max, Ensight, Lightscape, and Google Earth regressed by
less than 10%.)
However, modifying the TurboJPEG API
library so that it generates "abbreviated image datastreams",
i.e. JPEG images with no embedded tables (the equivalent of
motion-JPEG frames), improved the performance of pure JPEG
somewhat. Now Lightscape (+5%), GLXSpheres (+6%), Google Earth
(+9%), and Quake 3 (+16%) compressed better with pure JPEG than
with the TurboVNC encoder. (photos-24, kde-hearts-24, Catia,
Teamcenter Vis, and Unigraphics NX regressed by less than 10%.)
As above, only Pro/E (+8%) and Quake 3 (+5%) were faster with
pure JPEG. (3D Studio Max, Ensight, Lightscape, and Google
Earth regressed by less than 10%.)
I then repeated the same tests with the Interframe Comparison Engine enabled. Interframe comparison significantly improves compression with the 2D datasets, with mixed results for the 3D datasets, but the relative differences between the TurboVNC encoder and pure JPEG were pretty similar with interframe comparison enabled vs. disabled. With interframe comparison enabled, Pro/E now compressed better with pure JPEG than with the TurboVNC encoder, and Unigraphics NX compressed about the same. However, all of the other datasets were generally in the same relative ranges as above (give or take.)
Medium-Quality JPEG:
When you reduce the JPEG quality, the size of JPEG rectangles and subrectangles decreases, but the size of indexed-color rectangles stays the same. Thus, pure JPEG was more advantageous with medium-quality JPEG than it was with perceptually lossless JPEG. In this case, I tested only with interframe comparison enabled (since the "Tight + Medium-Quality JPEG" preset always enables it) and only with abbreviated image datastreams.
kde-hearts-16 (+4%), photos-24 (+3%),
kde-hearts-24 (+11%), Lightscape (+5%), Pro/E (+6%), Unigraphics
NX (+4%), GLXSpheres (+12%), Google Earth (+24%), and Quake 3
(+39%) compressed better with pure JPEG than with the TurboVNC
encoder. (Catia and Teamcenter Vis regressed by less than
10%.) 3D Studio Max (+13%), Ensight (+38%), and Pro/E (+13%)
were faster with pure JPEG. (Lightscape, SolidWorks, Google
Earth, and Quake 3 regressed by less than 10%.)
AVX2 Instructions:
The initial tests were conducted on an older machine that lacks AVX2 instructions, so I re-ran the tests on a newer machine. This gave pure JPEG more of a performance advantage, since the Intel zlib implementation cannot use AVX2 instructions but libjpeg-turbo can.
With perceptually lossless JPEG: Ensight (+3%), Pro/E (+11%), and Quake 3 (+7%) were faster with pure JPEG than with the TurboVNC encoder, and 3D Studio Max was about the same. (Lightscape and Google Earth regressed by less than 10%.)
With medium-quality JPEG: 3D Studio Max
(+26%), Ensight (+57%), Pro/E (+18%), SolidWorks (+8%), and
Quake 3 (+3%) were faster with pure JPEG, and Maya was about the
same. (kde-hearts-24, Catia, Lightscape, and Google Earth
regressed by less than 10%.)
Caveat:
The datasets in question were captured in
the early 2000s (the 3D datasets in 2008 and the 2D datasets
years earlier), so many of them represent outdated workloads.
(Most modern X11 applications use some form of image-based
rendering rather than X11 primitives.) Also, because of
limitations in the benchmark tools (which were inherited from
TightVNC), the datasets had to be generated using a very old VNC
server (TightVNC 1.3.9) and viewer (RealVNC 3.3.6) and an RFB
proxy that sat between the two. That infrastructure was slow
and effectively dropped a lot of frames, so the session captures
and benchmark tools are not the best simulation of the TurboVNC
Server. It would not surprise me if pure JPEG performs better
in real-world usage than is reflected above.
General Conclusions:
Unsurprisingly, the TurboVNC encoder is
the most advantageous, relative to pure JPEG, on older
(X11-primitive-based) workloads, workloads with fewer unique
colors, and workloads with large areas of solid color. Pure
JPEG is the most advantageous on image-based workloads,
workloads with more unique colors, and workloads with few areas
of solid color. Pure JPEG also has more of an advantage when
the JPEG quality is decreased and when AVX2 instructions are
available.
It seems as if pure JPEG encoding is advantageous enough in enough cases to justify its existence. I will look at including it in the next major release of TurboVNC, along with GUI modifications (https://github.com/TurboVNC/turbovnc/issues/70, as well as exposing the CompatGUI parameter in the GUI) that will make it more straightforward to enable non-Tight encodings.
I suspect that, if I were to completely
revisit my analysis from 2008 and develop entirely new datasets,
I would find little justification for indexed color subencoding
with modern applications. That would mean that most of the
advantage of the TurboVNC encoder these days comes from its
ability to send large areas of solid color using only a few
bytes. Both X11 and RFB were designed around the limitations of
1980s systems (including the need to support single-buffered
graphics systems.) Wayland jettisons the X11 legacy, but there
is also a burning need for a more modern open source/open
standard remote display protocol that is not beholden to the RFB
legacy, preferably a protocol that is a better fit for
image-based workloads, Wayland, GPU-resident framebuffers, and
modern video codecs. See
https://www.reddit.com/r/linux_gaming/comments/yvjqby/comment/jvricah/?utm_source=reddit&utm_medium=web2x&context=3,
https://github.com/TurboVNC/turbovnc/issues/18,
https://github.com/TurboVNC/turbovnc/issues/19, and
https://github.com/TurboVNC/turbovnc/issues/373 for more of my
musings on that topic. Do I think that anyone will ever fund
that kind of blue-sky research in an open source project such as
this? Probably not. TurboVNC is innovative compared to other
VNC solutions and maybe compared to most (but not all) open
source remote display solutions, but there are proprietary
solutions these days that do a lot of things that VNC will never
be able to do. (Let's start with streaming over UDP, which the
RFB protocol could never support.) People mostly use TurboVNC
because it's free and good enough, so I don't foresee being able
to do much more with the protocol other than minor tweaks like
this that allow it to get out of the way of certain use cases.
To clarify and tie my conclusions below to the comment I made previously about rectangle sizes:
One reason why the 2D datasets, which mostly represent legacy (primitive-based and single-buffered, as opposed to image-based and double-buffered) X11 rendering workloads, perform best with the TurboVNC encoder is that they have relatively small framebuffer updates. To put numbers on this, here are the average framebuffer update rectangle sizes (in pixels) of the various datasets:
slashdot-24: 3467
photos-24: 2962
kde-hearts-24: 1854
3dsmax-04-24: 998105
catia-02-24: 1005257
ensight-03-24: 837883
light-08-24: 681317
maya-02-24: 985356
proe-04-24: 843144
sw-01-24: 915438
tcvis-01-24: 786859
ugnx-01-24: 793831
glxspheres-24: 478242
googleearth-24: 8898
q3demo-24: 14875
The smaller the rectangle, the greater the
chance that it will have a low enough number of unique colors to
qualify for indexed color subencoding. More modern image-based
workloads usually double buffer, so the rendering occurs
off-screen, and the entire back buffer is swapped to the X11
display in one throw. Such workloads generally have large
framebuffer update rectangles. That type of workload is very
similar to what VirtualGL does, so the 3D datasets (which
consist of OpenGL applications and Viewperf datasets running
with VirtualGL) are more reflective of modern X11 applications.
However, some of those Viewperf datasets simulate wireframe
modes in certain CAD applications, so they have relatively low
numbers of unique colors and still benefit from the TurboVNC
encoder (relative to pure JPEG or modern video codecs such as
H.264.) Other Viewperf datasets have large areas of solid color
and also benefit from the TurboVNC encoder. Applications such
as games or Google Earth that fill the whole screen, render a
large number of unique colors, and render few areas of solid
color are the best candidates for pure JPEG or video codecs. The small rectangle
size in the Google Earth and Quake 3 datasets is likely a
result of the aforementioned ancient session capture
infrastructure. However, even with those small
rectangles, both datasets generally benefited from pure JPEG
encoding because of their high color counts. On the flip side,
some of the 3D datasets (Catia and Teamcenter Vis, for instance)
never benefited from pure JPEG, despite having large rectangle
sizes, because of their low color counts. It is also worth
mentioning that JPEG is designed to compress continuous-tone
images, so it does a relatively poor job of compressing sharp
features, such as those generated by wireframe modes in CAD
applications. Wireframe modes were once more common, because
they provided a way to smoothly interact with models that
couldn't otherwise be rendered in real time by the slow 3D
accelerators available at the time. (The first 3D accelerators
I worked with in the mid 1990s, based on the 3Dlabs GLINT chip,
could render about 300k polys/sec.) Those modes are less common
these days, but they still exist.
tl;dr: The TurboVNC encoder is a
compromise that maximizes performance across all of those
application categories as best it can, but there are specific
application categories for which a more video-like encoder is a
better solution. The 2D datasets are the same datasets that
Constantin used when designing the TightVNC encoder, so one of
the goals of the TurboVNC encoder overhaul in 2008 was to
provide similar compression ratios on those datasets relative to
TightVNC 1.3.x (to convince TightVNC users that they could
switch to TurboVNC without losing any performance on
low-bandwidth networks) while providing optimal compression
ratios and performance for 3D applications running with
VirtualGL.
DRC
So, at this point, we have several potential solutions:
1) The easiest solution is simply to allow the Tight maximum rectangle size to be changed (perhaps via an environment variable.)
2) The next easiest solution is to disable
all subencodings that use zlib (perhaps when the compression
level is 0.) This could be combined with Solution 1.
3) The hardest solution is to implement
pure JPEG encoding. Part of the challenge here is the fact that
the JPEG RFB encoding doesn't define a header. Per the spec,
you have to keep reading until you find a JPEG EOI (end-of-image) marker, which
is not only hard to implement but also represents a mild DoS
risk. (I say "mild" because the attack would have to be
implemented in the server, and the risk would be limited to
freezing any connected viewers.) Also, to take full advantage
of this feature, the TurboJPEG API would need to be extended to
handle image-only datastreams.
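The framing problem described in (3) can be sketched as follows (my own illustrative code, not a proposed implementation): since the JPEG RFB encoding defines no length header, the viewer must scan the incoming bytes for the JPEG EOI marker (0xFF 0xD9) to find the end of the rectangle. The DoS risk follows directly: a malicious or broken server can simply withhold the marker, leaving the viewer reading forever.

```c
#include <stddef.h>
#include <stdint.h>

/* Scan a buffer of received bytes for the JPEG EOI marker (0xFF 0xD9).
   Returns the length of the JPEG datastream including the marker, or 0
   if the marker has not arrived yet (caller must keep reading.)  This
   works because 0xFF bytes within JPEG entropy-coded data are always
   stuffed with a following 0x00, so 0xFF 0xD9 cannot occur mid-scan. */
size_t find_jpeg_end(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++)
        if (buf[i] == 0xFF && buf[i + 1] == 0xD9)
            return i + 2;
    return 0;  /* incomplete: wait for more data */
}
```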
If (1) is the minimum necessary to solve
the problem from your point of view, then let's just do that. I
personally can't find any compelling overall performance
advantage to (2) or (3), which makes me reluctant to spend any
time implementing one of those solutions in an official
capacity. But I'm OK with an environment variable that remains
undocumented until/unless it proves generally useful.
I just pushed a commit to both main (3.1.x Stable) and dev (3.2 Evolving) that allows you to change the maximum Tight subrectangle size via an undocumented environment variable (TVNC_MAXTIGHTRECTSIZE). Please let me know if that doesn't work for some reason.
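For anyone wiring up a similar override, a server might read such a variable along these lines. This is a hedged sketch of the general pattern, not the actual TurboVNC code: the function name and the fallback default are mine; only the TVNC_MAXTIGHTRECTSIZE variable name comes from the commit described above.

```c
#include <stdlib.h>

/* Illustrative sketch: return the value of TVNC_MAXTIGHTRECTSIZE if it
   is set to a positive integer, or the built-in default otherwise.
   The default passed in here is a placeholder, not TurboVNC's actual
   constant. */
int max_tight_rect_size(int defaultSize)
{
    const char *env = getenv("TVNC_MAXTIGHTRECTSIZE");
    if (env) {
        char *endptr;
        long val = strtol(env, &endptr, 10);
        if (endptr != env && *endptr == '\0' && val > 0)
            return (int)val;
    }
    return defaultSize;  /* unset or unparsable: keep the built-in max */
}
```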
DRC