On 28/07/18 07:46, Mark Mielke wrote:
> As I have been typing with TigerVNC 1.9.0, I noticed a very fine flickering
> (almost unnoticeable) of the pixels of the line or two of text in my
> terminal window just above the cursor. It doesn't happen in 1.8.0 or prior.
> I believe I am seeing an artifact of this feature from the release notes:
> - Automatic "repair" of JPEG artefacts on screen in all servers
Yeah, that sounds the most likely. I am however unable to reproduce the
effect. I don't suppose you could make a recording?
What are your network conditions like?
Have you made any changes to the compression settings?
Which TigerVNC server is this?
> I'm wondering if the sensitivity for when to use this feature might be too
> high. What is the threshold for this taking effect?
The feature is always running as soon as JPEG is used. How quick it is
depends on how much bandwidth it is detecting. It tries to make sure the
repair is only using "spare" bandwidth.
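
To make the "spare bandwidth" idea concrete, here is a toy model in Python. It is not TigerVNC's actual code: the function names, the per-interval budget and the oldest-first ordering are all assumptions made for illustration.

```python
def repair_budget_bytes(estimated_bandwidth_bps, interval_s, bytes_sent_for_updates):
    """Bytes left over for lossless repair in one interval (hypothetical model)."""
    capacity = estimated_bandwidth_bps / 8 * interval_s
    # Never go negative: if normal updates used everything, repair waits.
    return max(0, int(capacity - bytes_sent_for_updates))


def pick_regions_to_repair(lossy_regions, budget_bytes, bytes_per_pixel=4):
    """Pick (width, height) regions in order while their raw size fits the budget."""
    chosen = []
    for width, height in lossy_regions:
        cost = width * height * bytes_per_pixel
        if cost > budget_bytes:
            break  # Stop at the first region that does not fit.
        chosen.append((width, height))
        budget_bytes -= cost
    return chosen
```

In this model a busy link yields a budget of zero and no repair is sent, which would match the description that repair speed depends on the detected bandwidth.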
> Also, I wonder why the few lines above might be flickering, since according
> to my understanding - they are not changing. Why isn't only the cursor and
> characters being inserted refreshing? Why might it stretch a row or two of
> text above? It makes me wonder if there might be a calculation problem
> here, where the JPEG image being applied is larger than it needs to be, and
> this makes the correction larger than it needs to be?
Sometimes a larger part of the screen is updated by the application than
strictly necessary. However a default configured TigerVNC will detect
such redundant changes, so it is a bit odd...
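
The general technique of detecting redundant changes can be sketched as follows, assuming the server keeps a copy of the last frame it sent. This is a simplified Python illustration, not the TigerVNC implementation; the helper name `changed_blocks`, the list-of-rows frame layout and the 16-pixel block size are assumptions.

```python
def changed_blocks(old, new, rect, block=16):
    """Return sub-rectangles of `rect` whose pixels differ between two frames.

    `old` and `new` are lists of rows (lists of pixel values) of equal size;
    `rect` is (x, y, width, height) as reported by the application.
    """
    x0, y0, w, h = rect
    out = []
    for by in range(y0, y0 + h, block):
        bh = min(block, y0 + h - by)
        for bx in range(x0, x0 + w, block):
            bw = min(block, x0 + w - bx)
            # A block counts as changed if any of its rows differ.
            differs = any(
                old[y][bx:bx + bw] != new[y][bx:bx + bw]
                for y in range(by, by + bh)
            )
            if differs:
                out.append((bx, by, bw, bh))
    return out
```

If an application repaints a 64x32 area but only one pixel really changed, this returns a single 16x16 block, so far fewer pixels go out than came in.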
> I suspect most people wouldn't notice this. I also have a particular
> configuration which might be unusual and might exaggerate the impact of
> artifacts. In my scenario, I typically have a VNC client on Windows 10,
> connecting to a CentOS 7 VM, and then I have a VPN connection from the
> CentOS 7 VM connecting to a remote Oracle Linux 7 server. However, I don't
> think this is the cause of the problem - it is just that my scenario might
> make it slightly more perceptible. Also, to be clear - I have TigerVNC
> 1.9.0 as client and server on all points.
So you run a second vncviewer on the CentOS machine to connect to a
second server on the Oracle machine?
This is the part that I really think is the biggest concern. I can regularly see a much larger part of the screen refreshing than I think should be necessary. I will see if I can figure out how to record a video to show you. As I move the text cursor, it clearly seems like a section 3 or 4 lines above the text cursor, and 5 or 10 characters to the left and right of it, shimmers/flashes as I type. I think if the clip region were smaller, I wouldn't be able to see anything.
Also, I don't understand why it would be sending JPEG for something this small? Is there a minimum JPEG size somewhere in there, and the logic to send JPEG is kicking in unnecessarily, and then the logic to repair the JPEG is kicking in unnecessarily as a result?
> So you run a second vncviewer on the CentOS machine to connect to a
> second server on the Oracle machine?
Yes:

1) User (me) on Windows 10, running TigerVNC 1.9.0 x86_64 from bintray.
2) Server on CentOS 7, running custom built TigerVNC 1.9.0 x86_64 from source.
3) Client on CentOS 7, inside the CentOS 7 session started in 2, running the same client built for 2.
4) Server on Oracle Linux 7, running custom built TigerVNC 1.9.0 x86_64 from source.

I could try RHEL as well if that would help, but I don't expect any difference as they are all really the same software versions in place.

Since I last posted:

1) The connection from 1)-2) above is now -CompressLevel 1 -NoJpeg, which should eliminate that leg from being a factor.
2) I have applied a few additional patches since 1.9.0 that seemed important for other reasons. They have not made a difference to the symptoms I report above.

I suppose there is some possibility that the RHEL/CentOS/OL builds I am doing are using the GCC that comes with these, and it has a bug. That seemed a remote possibility to me though, and also difficult to test for. :-) I'm more hoping that by describing my symptoms, it will clue the experts such as yourself on what code to look at, and you'll easily spot the place where "-1" should be "+1" or some such accident. :-)
On 14/08/18 00:57, Mark Mielke wrote:
> Also, I don't understand why it would be sending JPEG for something this
> small? Is there a minimum JPEG size somewhere in there, and the logic to
> send JPEG is kicking in unnecessarily, and then the logic to repair the
> JPEG is kicking in unnecessarily as a result?
The algorithm was written with the idea that CPU was the bottleneck and
using JPEG was generally faster. So it can be overly aggressive in using
JPEG.
> 1) The connection from 1)-2) above is now -CompressLevel 1 -NoJpeg, which
> should eliminate that leg from being a factor.
Do you also use -CompressLevel 1 for the second hop? Because your video
clearly shows that something isn't working with the comparison logic,
and that setting unfortunately has the side effect of disabling it.
Could you try using -CompressLevel 2 instead? Or start the server with
-CompareFB 1?
On Wed, Aug 15, 2018 at 4:06 AM Pierre Ossman <oss...@cendio.se> wrote:
> On 14/08/18 00:57, Mark Mielke wrote:
> > Also, I don't understand why it would be sending JPEG for something this
> > small? Is there a minimum JPEG size somewhere in there, and the logic to
> > send JPEG is kicking in unnecessarily, and then the logic to repair the
> > JPEG is kicking in unnecessarily as a result?
> The algorithm was written with the idea that CPU was the bottleneck and
> using JPEG was generally faster. So it can be overly aggressive in using
> JPEG.

Why such a big area for JPEG, though? Why not just the before and after of where the text cursor was, and the pixels that changed in the frame buffer from before and after? I'm feeling doubtful that xfce4-terminal or other is flashing the screen... but maybe it is?

> > 1) The connection from 1)-2) above is now -CompressLevel 1 -NoJpeg, which
> > should eliminate that leg from being a factor.
> Do you also use -CompressLevel 1 for the second hop? Because your video
> clearly shows that something isn't working with the comparison logic,
> and that setting unfortunately has the side effect of disabling it.

Yes. I thought CompressLevel 1 would minimize latency and CPU overhead, and it seemed to have the best results for me in reducing my perception of this artifact.

> Could you try using -CompressLevel 2 instead? Or start the server with
> -CompareFB 1?

If the -CompareFB 1 is important, I can try it. But I would lose my dozens of active windows and sessions, so if CompressLevel 2 is also satisfactory, please see this link:
On perception, I found CompressLevel 2 to make things worse. The MP4 has mostly captured the experience that I see. (I had been worried the MP4 would introduce its own artifacts, or soften the artifacts I did see... but it's pretty close...)
On 15/08/18 10:26, Mark Mielke wrote:
> If the -CompareFB 1 is important, I can try it. But I would lose my dozens
> of active windows and sessions, so if CompressLevel 2 is also satisfactory,
> please see this link:
They should have the same effect, but apparently it's not working...
Just to make sure, could you try -CompressLevel 2 on the first client as
well?
And could you open the options dialogs on both clients while connected
to verify that the setting has truly taken effect?
You can also check the logs of the two VNC servers. The comparing
tracker will output some statistics on each disconnect. It should
indicate a 1:1 ratio (i.e. that it wasn't used).
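
For anyone wanting to check those statistics without reading the log by hand, a small Python sketch is shown here. The regular expression is an assumption based only on the "ComparingUpdateTracker" log lines quoted in this thread (units of pixels, kpixels, Mpixels or Gpixels), and `comparing_ratio` is a hypothetical helper name.

```python
import re

# Unit prefixes as they appear in the quoted log lines (assumed decimal).
UNITS = {"": 1, "k": 1e3, "M": 1e6, "G": 1e9}
PATTERN = re.compile(
    r"ComparingUpdateTracker:\s*([\d.]+) ([kMG]?)pixels in / ([\d.]+) ([kMG]?)pixels out"
)


def comparing_ratio(log_text):
    """Return pixels-in divided by pixels-out, or None if no stats line is found."""
    m = PATTERN.search(log_text)
    if m is None:
        return None
    pixels_in = float(m.group(1)) * UNITS[m.group(2)]
    pixels_out = float(m.group(3)) * UNITS[m.group(4)]
    return pixels_in / pixels_out
```

A result close to 1.0 would suggest the comparison was not in use; a much larger value means it filtered out redundant updates.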
Wed Aug 15 04:10:48 2018
VNCSConnST: Server default pixel format depth 24 (32bpp) little-endian rgb888
VNCSConnST: Client pixel format depth 24 (32bpp) little-endian rgb888

Wed Aug 15 04:17:38 2018
Connections: closed: 192.168.1.140::50314 (Clean disconnection)
EncodeManager: Framebuffer updates: 1273
EncodeManager: Tight:
EncodeManager: Solid: 1.605 krects, 16.6545 Mpixels
EncodeManager: 25.0781 KiB (1:2594.92 ratio)
EncodeManager: Bitmap RLE: 262 rects, 52.233 kpixels
EncodeManager: 7.72363 KiB (1:26.8145 ratio)
EncodeManager: Indexed RLE: 2.443 krects, 11.844 Mpixels
EncodeManager: 2.70307 MiB (1:16.7252 ratio)
EncodeManager: Full Colour: 1.719 krects, 18.9703 Mpixels
EncodeManager: 8.48579 MiB (1:8.53021 ratio)
EncodeManager: Total: 6.029 krects, 47.5211 Mpixels
EncodeManager: 11.2209 MiB (1:16.1616 ratio)
TLS: TLS session wasn't terminated gracefully
ComparingUpdateTracker: 2.29322 Gpixels in / 46.1928 Mpixels out
ComparingUpdateTracker: (1:49.6446 ratio)
Window manager warning: Invalid WM_TRANSIENT_FOR window 0x38000f1 specified for 0x38000f8 (Network Co).

Wed Aug 15 04:11:31 2018
VNCSConnST: Server default pixel format depth 24 (32bpp) little-endian rgb888
VNCSConnST: Client pixel format depth 24 (32bpp) little-endian rgb888

Wed Aug 15 04:17:29 2018
Connections: closed: 10.176.154.179::52376 (Clean disconnection)
EncodeManager: Framebuffer updates: 1229
EncodeManager: Tight:
EncodeManager: Solid: 243 rects, 3.14434 Mpixels
EncodeManager: 3.79688 KiB (1:3235.67 ratio)
EncodeManager: Bitmap RLE: 106 rects, 26.722 kpixels
EncodeManager: 3.17969 KiB (1:33.2187 ratio)
EncodeManager: Indexed RLE: 1.184 krects, 4.42461 Mpixels
EncodeManager: 954.189 KiB (1:18.128 ratio)
EncodeManager: Full Colour: 370 rects, 1.35482 Mpixels
EncodeManager: 372.076 KiB (1:14.2352 ratio)
EncodeManager: Tight (JPEG):
EncodeManager: Full Colour: 753 rects, 6.42291 Mpixels
EncodeManager: 2.16377 MiB (1:11.3275 ratio)
EncodeManager: Total: 2.656 krects, 15.3734 Mpixels
EncodeManager: 3.46577 MiB (1:16.93 ratio)
TLS: TLS session wasn't terminated gracefully
ComparingUpdateTracker: 185.522 Mpixels in / 9.86783 Mpixels out
ComparingUpdateTracker: (1:18.8007 ratio)

-bash-4.2$ rpm -q xorg-x11-server-Xorg
xorg-x11-server-Xorg-1.19.5-5.el7.x86_64
-bash-4.2$ ldd /usr/bin/Xvnc | grep jpeg
libjpeg.so.62 => /usr/lib64/libjpeg.so.62 (0x00007effb3341000)
-bash-4.2$ rpm -q -f /usr/lib64/libjpeg.so.62
libjpeg-turbo-1.2.90-5.el7.x86_64
On 15/08/18 10:58, Mark Mielke wrote:
> ComparingUpdateTracker: 185.522 Mpixels in / 9.86783 Mpixels out
> ComparingUpdateTracker: (1:18.8007 ratio)
Same thing here. So why are you seeing those artefacts...
Can you do a setup with just a single hop and see if the problem remains?
> Tomorrow I'm going to go into work and see if I can reproduce the artifacts
> in reverse, connecting back home. Not sure if it'll prove or disprove
> anything, but it will add a data point for consideration.
Please do. I have no good idea what's going on right now.
I haven't finished investigating (in between other things). But some strangeness to keep you up-to-date on:

1) I tried from a Windows 7 notebook over VPN direct with one hop only. Result: I didn't see the artifacts.
2) I tried from a Windows 7 notebook (with integrated GPU?) over wireless, with the same two hops. Result: I didn't see the artifacts.
3) I tried from a Windows 10 desktop (with nVidia dedicated GPU) over a 1 Gbit/s network, with the same two hops. Result: Artifacts visible as before.

I don't know which variable here is causing the artifacts, but I am planning to come up with tests to try and isolate it:

1) Windows 7 vs Windows 10 as client?
2) Integrated GPU vs nVidia dedicated GPU? I tried to disable 2D/3D "enhancements"... no change so far, but I'm also not sure I successfully disabled them yet either.
3) Monitor? (I tried to adjust the monitor "enhancements"... no change, and it also doesn't make sense as a cause, since I could record the problem and the monitor should be downstream of that...)
4) Wireless vs wired networking? Could this muck with the algorithm for guessing available bandwidth? But the problematic hop seems to be one hop removed from this?
On 20/08/18 07:01, Mark Mielke wrote:
> Another interesting symptom... If I turn off JPEG on the fly for the second
> hop using the option panel, the artifact disappears as I mentioned earlier,
> but if I turn JPEG back on using the same mechanism, the artifact does not
> return. No more bad behaviour. What does this mean?
>
> 1) When it re-initialized the JPEG compression (noJpeg / qualityLevel), it
> does it better in some way the second time?
> 2) JPEG compression doesn't actually get re-activated?
That shouldn't happen, so sounds like some form of bug. Let's focus on
one thing at a time here, so avoid changing settings on the fly. :)
Have you verified that auto mode is properly disabled? Otherwise we
might be seeing some interaction with that.
Are you running the standard GNOME environment on both servers?
I noticed your recent patches related to the lossless refresh. I applied them and at first, there was no real change. However, when I removed "-QualityLevel 4" that I had been using to try and minimize the symptoms, so far so good. I'm not seeing the brief obscured pixels around the cursor when moving, or changing tint of colors (particularly green?) in a larger screen of text that is moving.

I still have never managed to reproduce with a single hop. It's always with the two hops. It always seems worse moving my text cursor in my Xfce terminals than Firefox or Chrome (for which I normally don't notice any artifacts). I've also not heard a single complaint from any of the people that use this particular build, so I may be very unique with my double hop scenario (and could be unique in other ways?).

I haven't had a lot of time to look into this myself, and with my symptoms possibly cleared with the default of "-QualityLevel 8", the itch to scratch might be gone (as long as I don't reduce the -QualityLevel in future!). It still seems like my scenario is able to trigger a bug, but I don't know for sure it is in TigerVNC. Perhaps it is in libjpeg on RHEL 7, or Xorg?

I will try to find time to see if I can tickle this a little more and figure out what causes it. If you can think of anything more for me to try testing, please let me know.