Testing gateware 20201121_72p6 for TX latency and relay click debug released


Steve Haynal

Nov 21, 2020, 8:53:03 PM
to Hermes-Lite
Hi Group,

Please see the 20201121_72p6 testing gateware release. This is an experimental release with extra instrumentation to debug TX latency and relay click issues. It is intended primarily for software developers or technical people who want to tune or measure their HL2 network performance. Don't use it unless you are comfortable with Python and other software. You don't need it if setting TX buffer latency to 20ms and PTT hang time to 12ms solves your relay click issues. I will add more information to this thread as I collect it.

73,

Steve
kf7o

Steve Haynal

Nov 21, 2020, 9:30:15 PM
to Hermes-Lite
Hi Group,

This release sends a debug packet to a computer connected via port 1025 right after every regular packet sent to the running software. The debug packet contains enough information to construct a value change dump (.vcd) file, the typical waveform file format used in digital logic simulation. The .vcd file for Quisk 4.1.73, running on a midgrade Ryzen system and captured on a quiet wired network, is shown below.
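
For readers unfamiliar with the format, a .vcd file is plain text: a header declaring each signal, followed by timestamped value changes. The sketch below is a minimal, illustrative writer and is not the code used by debug.py. The function name write_vcd, the scope name hl2 and the way changes are passed in are assumptions made for illustration; only signal names such as tx_on and txfifo_ms come from the captures in this thread.

```python
def write_vcd(path, signals, changes, timescale="1 us"):
    """Write a minimal value change dump (VCD) file.

    signals: dict mapping signal name -> bit width (hypothetical layout)
    changes: iterable of (time, name, value) tuples, already sorted by time
    """
    # VCD refers to each signal by a short printable-ASCII identifier.
    ids = {name: chr(33 + i) for i, name in enumerate(signals)}
    with open(path, "w") as f:
        f.write(f"$timescale {timescale} $end\n")
        f.write("$scope module hl2 $end\n")
        for name, width in signals.items():
            f.write(f"$var wire {width} {ids[name]} {name} $end\n")
        f.write("$upscope $end\n$enddefinitions $end\n")
        last_time = None
        for time, name, value in changes:
            if time != last_time:
                f.write(f"#{time}\n")  # new timestamp marker
                last_time = time
            if signals[name] == 1:
                f.write(f"{int(value)}{ids[name]}\n")       # scalar change
            else:
                f.write(f"b{int(value):b} {ids[name]}\n")   # vector change
```

A call such as write_vcd("example.vcd", {"tx_on": 1, "txfifo_ms": 7}, [(0, "tx_on", 1), (2625, "txfifo_ms", 10)]) should produce a file that a waveform viewer can open.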

In the capture below, we see that ptt_hang_time is 4 ms, tx_buffer_latency is 10 ms and packets are arriving from the PC at regular intervals as indicated by pkt_cnt. The signal cmd_ptt is PTT/TUNE/MOX from the PC. When it becomes True, tx_on immediately goes True. The signal tx_on controls the relays. Also, tx_wait goes True. While tx_wait is True, the TX buffer is allowed to fill with 10ms worth of data before transmit of any real signal begins. Since the relays are activated immediately with tx_on, this also allows time for the relays to settle. You can also see txfifo_ms filling up to ~10ms and then reaching a steady state where about 10ms remains in the buffer at all times, slowly drained but then refreshed with each arriving packet.
quisk_start.png

At the end of transmit, we see that cmd_ptt becomes False and the ~10ms worth of contents in the TX buffer are drained and sent. Signal tx_on stays True as transmit is still occurring. Once the TX buffer is empty and there is no more real data to send, tx_on still remains True for another ~4 ms as ptt_hang_time is set to 4ms. Zero (quiet) data is sent during this time.
quisk_stop.png
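
The start and stop sequences described above can be approximated with a small simulation. This is a simplified model stepped in 1 ms ticks, written from the description in this post rather than from the gateware source. The function name simulate_tx, the fixed 1 ms drain per tick and the rule for re-keying after a dropout are all assumptions.

```python
def simulate_tx(cmd_ptt, arrivals_ms, latency_ms=10, hang_ms=4):
    """Return the tx_on trace for a simplified HL2 TX model (1 ms ticks).

    cmd_ptt:     per-tick PTT request from the PC (True/False)
    arrivals_ms: per-tick amount of TX data arriving, in ms of audio
    """
    fifo = 0.0                      # ms of TX data currently buffered
    tx_on = tx_wait = prev_ptt = False
    hang_left = 0
    trace = []
    for ptt, arrived in zip(cmd_ptt, arrivals_ms):
        fifo += arrived
        key_up = ptt and not prev_ptt                # PTT just asserted
        recover = ptt and not tx_on and fifo > 0     # data returns after a dropout
        if key_up or recover:
            # Relays switch immediately; real data waits for the buffer to fill.
            tx_on, tx_wait, hang_left = True, True, hang_ms
        if tx_wait and (fifo >= latency_ms or not ptt):
            tx_wait = False
        if tx_on and not tx_wait:
            if fifo > 0:
                fifo = max(0.0, fifo - 1.0)          # send 1 ms of real data
                hang_left = hang_ms
            elif hang_left > 0:
                hang_left -= 1                       # buffer empty: send zeros
            else:
                tx_on = False                        # hang time expired
        prev_ptt = ptt
        trace.append(tx_on)
    return trace
```

In this model, steady arrivals with the 10 ms / 4 ms settings keep tx_on True through the whole transmission plus the drain and hang time, while latency_ms=0 and hang_ms=0 let tx_on drop as soon as a packet arrives late.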


If we zoom out, we can see some times where packet jitter causes the TX buffer to empty more. Still, it never reaches zero. This is the purpose of saving data in a buffer. No silent TX periods occur as there was always real TX data to send. No relay clicks occur as the buffer never emptied and then exceeded the 4ms ptt_hang_time. The cursor is on the worst dip and shows 4.7ms of data still remain in the buffer. I also ran this test while the same PC was receiving HD video over the network and saw similar results. This indicates that I should be able to run Quisk with a buffer latency of about 6ms (the relay settling time) and experience no interruptions or relay clicks.
quisk.png
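
The headroom reasoning above is simple arithmetic: the worst dip tells you how much of the buffer jitter actually consumed, and the relay settling time sets a floor. A minimal sketch, assuming that reading of the capture (the function name and the 6 ms default are illustrative, taken from the figures quoted in this post):

```python
def suggested_latency_ms(configured_ms, lowest_fill_ms, relay_settle_ms=6.0):
    """Estimate the smallest TX buffer latency a capture would support.

    configured_ms:   tx_buffer_latency used during the capture
    lowest_fill_ms:  lowest txfifo_ms value seen during TX
    relay_settle_ms: floor imposed by relay settling (assumed 6 ms)
    """
    worst_drain = configured_ms - lowest_fill_ms   # ms consumed by jitter
    return max(worst_drain, relay_settle_ms)

# Quisk capture above: 10 ms configured, worst dip left 4.7 ms in the buffer.
print(suggested_latency_ms(10, 4.7))
```

Here jitter consumed about 5.3 ms, so the 6 ms relay settling time is the limiting factor. This ignores any safety margin, which a real setup would probably want.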

73,

Steve
kf7o

Steve Haynal

Nov 21, 2020, 9:44:31 PM
to Hermes-Lite
Hi Group,

Continuing the analysis, below is a zoomed out screenshot of SparkSDR 2.0.3.9 running on the same Linux system. The TX start and stop sequences are as expected. Here we see the ptt_hang_time of 22ms which Alan mentioned was set by accident. The tx_buffer_latency is set to 10ms. We see a bit more packet jitter, but the tx_buffer never went down to 0 (0.7ms was the lowest), so no TX interruptions and no relay clicks. Remember, a relay click only occurs if the tx_buffer empties and the ptt_hang_time of 22ms in this case expires.
sparksdr.png

Next is a similar capture using linhpsdr. This is the latest from the github repository maintained by Matthew for the HL2 again running on the same Linux setup. The capture below was taken while that PC was also recording HDTV data (the news) from my networked HDTV receiver. Here we see that linhpsdr is setting the tx_buffer_latency to 21ms and ptt_hang_time to 4ms. This is very clean and the lowest we see the TX buffer dip to is 16ms. As with Quisk, I should be able to run this at 6ms latency with no TX interruptions or relay clicks.
linhpsdr.png 

73,

Steve
kf7o

Steve Haynal

Nov 21, 2020, 10:04:44 PM
to Hermes-Lite
Hi Group,

A final interesting case to look at on Linux systems is pihpsdr. This is using the latest from the github which Christoph maintains with HL2 enhancements. It is on the same Linux system again with HDTV data being recorded at the same time. pihpsdr does not yet set the tx_buffer_latency nor ptt_hang_time so the defaults are used. In the zoomed out capture, it looks like there is considerable packet jitter yet the lowest the TX buffer reaches is 7ms.
pihpsdr.png 

This is because pihpsdr sends packets in groups of 8 and "prepopulates" the TX buffer. If we zoom in we can see that pihpsdr has sent enough data to fill the TX buffer to ~30ms. Also, the incrementing pkt_cnt shows that we are receiving 8 packets at a time.
pihpsdr2.png

With this method, one would not want to set the tx_buffer latency too high as the TX buffer could be prepopulated to exceed the ~80ms maximum capacity.
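
A rough way to see the capacity concern is to add one burst of packets on top of the configured latency. The sketch below assumes each packet carries 126 TX samples at 48 kHz (about 2.625 ms of audio); that figure is not stated in this thread and should be checked against the protocol documentation. The function names and the simple "latency plus one burst" model are also assumptions.

```python
SAMPLES_PER_PACKET = 126      # assumption: not stated in this thread
TX_SAMPLE_RATE_HZ = 48_000    # assumption: not stated in this thread
PACKET_MS = 1000 * SAMPLES_PER_PACKET / TX_SAMPLE_RATE_HZ

def peak_fill_ms(latency_ms, burst_packets):
    """Approximate peak buffer fill when a whole burst lands at once."""
    return latency_ms + burst_packets * PACKET_MS

def max_latency_ms(burst_packets, capacity_ms=80.0):
    """Largest latency setting that leaves room for one full burst."""
    return capacity_ms - burst_packets * PACKET_MS

print(peak_fill_ms(10, 8))    # default 10 ms latency, bursts of 8 packets
print(max_latency_ms(8))      # against the ~80 ms capacity mentioned above
```

Under these assumptions a burst of 8 packets on top of the default 10 ms lands near the ~30ms fill seen in the capture, and the latency setting would need to stay well under the ~80ms capacity.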

I plan to do similar analysis with Windows programs.

The point of this analysis is to show once and for all exactly what happens during relay clicks and hopefully end all complaints and posts about relay clicks. For all 4 Linux software programs I tried, 10ms latency and 4ms PTT hang time did not result in relay clicks. In fact, I could have reduced latency to 6ms for most software. The HL2 offers an advanced flexibility not seen in other openhpsdr protocol1 radios, which allows users to set and tune latencies. If you are experiencing relay clicks, the quality of service of your network or your PC hardware is most likely at fault. The solution is to:

* Set your TX buffer latency and PTT hang time higher via your SDR software, or via hermeslite.py if your software does not support it. This has been available for quite some time now. Future gateware will use higher defaults to handle some of the poorer setups reported.
* Tune your OS or other running programs to not cause indirect network delays.
* If problems still persist, install this testing gateware, run debug.py and provide the .vcd files to the group so that we can help debug.

73,

Steve
kf7o





Steve Haynal

Nov 22, 2020, 12:47:21 AM
to Hermes-Lite
Hi Group,

Below is a capture of what happens if tx_buffer_latency and ptt_hang_time are set to 0. Notice that during TX the tx_on signal briefly becomes False a few times. These are relay clicks.

73,

Steve
kf7o
quisk0.png
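
Given sampled cmd_ptt and tx_on traces (for example, values read out of a .vcd capture), the clicks described in this post can be counted as falling edges of tx_on while cmd_ptt is still True. This is a minimal sketch; the function name and the list-based trace format are assumptions, not part of debug.py.

```python
def count_relay_clicks(cmd_ptt, tx_on):
    """Count tx_on dropouts that happen while the PC is still requesting TX."""
    clicks = 0
    prev_on = False
    for ptt, on in zip(cmd_ptt, tx_on):
        if prev_on and not on and ptt:
            clicks += 1          # tx_on fell although PTT is still asserted
        prev_on = on
    return clicks
```

The normal drop of tx_on after cmd_ptt goes False at the end of a transmission is not counted.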

didier....@gmail.com

Nov 22, 2020, 2:11:49 AM
to Hermes-Lite
Hello Steve,

Today I will perform some tests and will provide some feedback ASAP.

Currently I am using the P5 and have no noticeable issues on my WiFi network when operating remotely. With my Hermes-Lite fed directly by a LAN cable, I have had no issues at all since gateware 70 (Quisk and pihpsdr with an RPi 3B+).

73s 

Didier

F5NPV

didier....@gmail.com

Nov 22, 2020, 3:16:49 AM
to Hermes-Lite
Hello ,

After uploading the P6 gateware, this is basically the test I performed from a remote location through my WiFi network.

WiFi:
I have a repeater in my shack (the brand is Aigital, definitely not a top-notch device). It provides 2 LAN ports, with one plugged directly into my RPi 3B+ and the other into my Hermes-Lite.
This repeater is linked to my main WiFi router located in my living room, and my remote Windows 10 computer is connected to this router. This means I am connected to the Hermes-Lite through 2 devices (the main router and the repeater).


Test conditions (equipment):
-Hermes-Lite connected to the repeater in the shack
-Barefoot output power of about 4W into a dummy load, plus a very quick test on the air
-Monitoring device is a TS520 with a very small antenna
-Remote location is upstairs and the WiFi signal level from my main router is about 90%. I did a scan prior to the test and the WiFi channel was clear.
From the remote location I can monitor my output power with a web page, so if there is a glitch or relay chatter I can definitely notice the issue with this remote capability as well.


SPARK : Release 2.0.3.0
I tried to find a buffer or hang time setting but could not find one, so the setup is stock.
Tune and SSB test :
80M Dummy load and on the air = OK no clicking
40M Dummy load and on the air = OK no clicking
20M Dummy load and on the air = OK no clicking
FT8 Test just TX:
20m Dummy load and on the air = OK no clicking but a harsh noise at the end of the transmit

PowerSDR V3.5.0 BETA 8:
Tune and SSB test with TX buffer = 31 and hang time = 10:
80M Dummy load and on the air = OK no clicking
40M Dummy load and on the air = OK no clicking
20M Dummy load and on the air = OK no clicking
Tune and SSB test with TX buffer = 20 and hang time = 12:
80M Dummy load and on the air = OK no clicking
40M Dummy load and on the air = OK no clicking
20M Dummy load and on the air = OK no clicking
Tune and SSB test with TX buffer = 6 and hang time = 1:
80M Dummy load and on the air = NOK Relay clicking --> Not good at all
40M Dummy load and on the air = NOK Relay clicking --> Not good at all
20M Dummy load and on the air = NOK Relay clicking --> Not good at all

THETIS V2.8.11:
I tried to find a buffer or hang time setting but could not find one, so the setup is stock.
Tune and SSB test :
80M Dummy load and on the air = OK no clicking
40M Dummy load and on the air = OK no clicking
20M Dummy load and on the air = OK no clicking


QUISK V 4.1.72:
I tried to find a buffer or hang time setting but could not find one:
-In timing and CW setup my play latency is 500 
-hardware poll is 15000
-keyup delay is 23ms
Tune and SSB test :
80M Dummy load and on the air = OK no clicking
40M Dummy load and on the air = OK no clicking
20M Dummy load and on the air = OK no clicking

SDR CONSOLE V3.0.21
Setup for transmit OPTIONS --> TX-TONE-TUNE-->RF delay
tx on = 10ms
TX off = 10ms
80M Dummy load and on the air = OK no clicking
40M Dummy load and on the air = OK no clicking
20M Dummy load and on the air = OK no clicking

James Ahlstrom

Nov 22, 2020, 6:51:44 AM
to Hermes-Lite
Hello Steve,

This is some very impressive work! Thanks for taking the time to do this. It must have been very tedious.

Jim
N2ADR

PA3GSB

Nov 22, 2020, 9:06:21 AM
to Hermes-Lite
Hi Steve,

Again a very nice utility, impressed!

Really nice to make the internal state of the HL visible.

I repeated the experiment on a Windows 10 machine. I am also using VirtualBox on that machine. To make it work I had to disable the Ethernet driver for the VirtualBox Host-Only Ethernet Adapter!

Running sparkSDR results in :



73 Johan
PA3GSB

On Sunday, November 22, 2020 at 12:51:44 UTC+1, jah...@gmail.com wrote:

PA3GSB

Nov 22, 2020, 9:08:29 AM
to Hermes-Lite
gtk.JPG

sorry picture not attached

On Sunday, November 22, 2020 at 15:06:21 UTC+1, PA3GSB wrote:

Alan Hopper

Nov 22, 2020, 9:59:30 AM
to Hermes-Lite
Hi All,
just for reference the recent versions of spark have the buffer settings in the radio settings so you have to set it for each radio if you have more than one.
73 Alan M0NNB

Steve Haynal

Nov 22, 2020, 11:24:59 PM
to Hermes-Lite
Hi Group,

I ran the same analysis on Windows software today. Overall, the results were slightly better than Linux. The Windows 10 machine I used has an i7-8850H, 6 cores/12 threads, and a passmark score of 10403. The Linux machine is a Ryzen 3200G, 4 cores/4 threads, with a passmark score of 7241. Both were connected to the same network with only a gigabit switch between the computer and HL2. Perhaps the better CPU in the Windows 10 machine explains the difference. Perhaps the Windows machine was less loaded, as I still captured the debug data on the Linux machine.

I did find one serious issue with PowerSDR v3.5.0_Beta_8. Although this version has the ability to set TX buffer latency and PTT hang time, it always zeros these values out in the HL2 at startup. I had to manually go and just increment/decrement both TX buffer latency and PTT hang time every time I started PowerSDR. This could explain why PowerSDR users complain often about relay clicking. Reid, please check PowerSDR initialization code and make sure the TX buffer latency and PTT hang time fields are not initialized to zero.

Below is the result for Quisk 4.1.73 with Python 3.8. The TX buffer never dipped below 8.7ms during TX. This implies we could run with 2ms latency, if we have a way for relays to settle that quickly.
quisk.win.png 

Below is the result for SparkSDR 2.0.3.9. I did adjust the TX buffer and PTT hang times from within SparkSDR to be 10ms and 4ms. The TX buffer never dipped below  8.7ms. There are very few dips to 8.7ms. SparkSDR produced the cleanest debug output of all the Windows software I tried.
sparksdr.win.png

Below is the result for PowerSDR v3.5.0_Beta_8, the latest from Reid's github repository. This was after I set values to 10ms and 4ms. The TX buffer never dipped below 6ms. From the pkt_cnt values, PowerSDR is not sending groups of packets but does appear to drift and send packets slightly slower than expected, so that the buffer drains down to 6 ms. But then it catches up and the process repeats. When it catches up, it may prepopulate so that the buffer is filled to 14ms. This explains the greater long term jitter seen. Also, in the 4th transmit, it never prepopulated up to 14ms but only to 10ms. The time in the buffer is noticeably lower for the 4th TX. I don't have an explanation for that.
powersdr.win.png

Every time I started PowerSDR, the TX buffer latency and PTT hang time values were zeroed out. The result below shows this. Here you can see tx_on thrashing. This is what causes relay clicks. I believe this may be why there are continued complaints of relay clicking in PowerSDR: it zeroes out tx_buffer_latency and ptt_hang_time values at startup.
powersdr.win0.png


Below are the results for SDR Console. I used the latest public 64-bit release 3.0.25. Although this release does not have the capability to set TX buffer latency and PTT hang times, it does not overwrite the values so the default of 10ms/4ms applies, or whatever the last software set the HL2 values to. The TX buffer never drops below 7ms.
sdrconsole.win.png

Below we see what is happening if we zoom in to the SDR Console results. Here we see some timing variation between when packets are sent and when they are received. But it never drifts far, and the 7ms value implies we could still run this setup with a TX buffer latency as low as 3 or 4 ms if using the low power output. The high power output with relays does require at least 6ms for the relays to settle.

sdrconsole.winz.png


73,

Steve
kf7o

Reid Campbell

Nov 25, 2020, 4:19:16 AM
to herme...@googlegroups.com
Hi Steve,

I had a look at this and I can see the value being setup on connection
to the HL2. What I did discover was that a double masking of the bits
was happening, so that the max latency delay is limited to 5 bits (31)
and the value will roll over on values above that (32 gives 0 latency).
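
The rollover described here is what any 5-bit mask does to values above 31. The snippet below only illustrates that arithmetic; it is not PowerSDR code, and the function name is made up.

```python
def mask_5bit(value):
    """Keep only the low 5 bits, as a 0x1F mask would."""
    return value & 0x1F

for requested in (20, 31, 32, 40):
    print(requested, "->", mask_5bit(requested))
```

Requests of 31 or less pass through unchanged, while 32 wraps to 0 and 40 wraps to 8.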

I'm wondering if there is something in the Power SDR database giving the
issue. I would recommend a database reset to see if that cures the zero
latency issue.

For now, could anyone using PowerSDR Beta 8 limit the Tx Buffer Latency
to 31 or less and I'll get a fix out soon.

Cheers

Reid
Gi8TME/Mi0BOT

Chris Gerber

Nov 25, 2020, 5:22:41 AM
to herme...@googlegroups.com
Yes Geir had to do it here.

73 Chris hb9bdm

Steve Haynal

Nov 27, 2020, 1:26:26 AM
to Hermes-Lite
Hi Reid,

I'll try a database reset. This Windows computer did have an old version of PowerSDR from several years ago which I removed before trying your version but without a database reset.

73,

Steve
kf7o

Reid Campbell

Nov 29, 2020, 3:50:16 AM
to herme...@googlegroups.com
Hi Steve and Group,

I'm now also seeing the issue of the Tx latency and PTT hang being set
to zero and I have a handle on why. The work around at the minute is to
do as Steve did and change both values after you connect to the HL2.

Cheers

Reid
Gi8TME/Mi0BOT

Steve Haynal

Dec 1, 2020, 2:24:15 AM
to Hermes-Lite
Hi Reid,

Thanks for digging into this. I never did get around to resetting the database. Do you think this problem could occur in all versions of PowerSDR, including the stock versions?

73,

Steve
kf7o

Reid Campbell

Dec 1, 2020, 5:48:02 AM
to herme...@googlegroups.com
Hi Steve,

It was the first time I had to add a new protocol item, so it won't affect the older versions.

Cheers

Reid
Gi8TME/Mi0BOT

W. Jozef

Dec 7, 2020, 9:30:15 PM
to Hermes-Lite
Hi.
Is there a hermeslite.py version (with user manual) that works well under Windows 10?
Would it allow fixing the distorted signal and clicking relays, for example in Console V3, by changing the buffer and delay?
73, Joe

Steve Haynal

Dec 8, 2020, 12:55:06 AM
to Hermes-Lite
Hi Joe and Group,

The data in this thread shows that SDR console and all other software tested work just fine in a decent network with the default TX buffer latency of 10ms and PTT hangtime of 4ms. There was one issue with PowerSDR zeroing out these values but that has been fixed by Reid. If you are still having network quality-of-service issues, I suggest first asking Simon for a version of SDR console where he has set TX buffer latency to 20ms and PTT hangtime to 12ms:

The debug.py utility which uses hermeslite.py can be run from a Windows command prompt if Python3 is installed. It does not need to be run on the computer running the SDR software. You can run it on a Linux machine provided it is connected to the same network. The documentation is the code itself as well as:

For those with programming and at least some Python background, this will all make sense. Several people have already run the utility. I don't have the time, desire or teaching expertise to get into the Python education business. There are many well-done Python tutorials for both Windows and Linux already on the internet for those who wish to learn. If someone makes a valiant attempt to use the utility and has a specific question about a problem they encounter, then I will try to answer the question.

73,

Steve
kf7o 

ron.ni...@gmail.com

Dec 8, 2020, 10:15:41 AM
to Hermes-Lite
If you hear relays clicking during transmit, please first check the network connection between your computer and the HL2.  Transmitting with the HL2 requires a cleaner, more uniform latency network connection than even high quality video streaming or teleconferencing.

If the variation in network packet latency is too high on your network connection (or there are lost or swapped packets), then changing the buffer and delay won't help.  Try a long sequence of pings to the HL2 IP address. If you see round-trip ping times jumping by more than, say, 20 ms above the average, then you probably need to improve your network connection to the HL2 before modifying HL2 parameters.
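
The check suggested above can be automated by saving the ping output to text and analysing the round-trip times. The sketch below is one possible approach: the function names are made up, the regular expression assumes the common "time=1.23 ms" or "time<1ms" output styles, and the 20 ms limit is the rough figure from this post rather than a hard rule.

```python
import re
from statistics import mean

# Matches "time=0.41 ms" (Linux/macOS style) and "time<1ms" (Windows style).
RTT_RE = re.compile(r"time[=<]\s*([\d.]+)\s*ms", re.IGNORECASE)

def parse_rtts(ping_output):
    """Extract round-trip times in ms from captured ping output."""
    return [float(m.group(1)) for m in RTT_RE.finditer(ping_output)]

def jitter_report(rtts_ms, limit_ms=20.0):
    """Summarise RTTs and count samples more than limit_ms above the average."""
    avg = mean(rtts_ms)
    spikes = [r for r in rtts_ms if r - avg > limit_ms]
    return {"avg": avg, "max": max(rtts_ms), "spikes": len(spikes), "ok": not spikes}
```

A report with "ok" set to False suggests looking at the network path before touching the HL2 buffer settings.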

73,
Ron
n6ywu
On Monday, December 7, 2020 at 6:30:15 PM UTC-8 W. Jozef wrote:

Steve Haynal

Dec 8, 2020, 2:11:49 PM
to Hermes-Lite
Hi Ron and Joe,

The jitter recommendations for video and voice that I could find are from Cisco: 30ms or less one way. These numbers are for a WAN, not a LAN. The old HL2 buffer was 42ms deep, so it could handle +/-20ms jitter, which is reasonable in my opinion for a LAN. For the past several months, the new HL2 buffer has been adjustable across 85ms, or +/-40ms, which exceeds the recommendations from Cisco. So I disagree with your claim that the HL2 requires higher QOS than video or teleconferencing.

I suspect there are two reasons for this problem when it occurs for a few people. First, the network setup may be poor and not meet the Cisco jitter requirements of <30ms. Ping (especially ping flood for Linux as discussed before on this list) can help determine the maximum jitter seen. Second, computers and software may add jitter that exceeds the recommendation, for example if the computer is too slow or overworked, if another application is hogging resources, or if the software is poorly written. My testing in other threads shows that all 6 software packages I tried do not introduce more than 10ms of network+computer combined jitter when a midrange PC and gigabit network are in use.

There are 3 software packages I know of which allow the user to adjust the TX buffer latency and PTT hangtime: Quisk, SparkSDR, and Reid's PowerSDR port. All three run on Windows. The first two run on Linux. If people are having QOS problems and can't run hermeslite.py/debug.py, I suggest trying each of these programs and reporting back the settings required to eliminate problems. 

In future gateware I will increase the default values from 10ms/4ms to 20ms/12ms, but I think it is better if people understand these settings and learn how to adjust them for their setup.

73,

Steve
kf7o

ron.ni...@gmail.com

Dec 8, 2020, 3:55:28 PM
to Hermes-Lite
Hi Steve,

An 85 mS buffer size does help with network latency issues over wired networks, but most of the WiFi networks I've tested (multiple locations) have UDP latency variations that can jump over 100 mS.  So an 85 mS max buffer will still cause HL2 Tx to glitch.  Also note that these WiFi networks are in households where HD video streaming, FaceTime, Zoom, et.al. seem to work just fine, likely due to some combination of FEC, error concealment strategies, and buffering used by the encoders/decoders.

Also, you may be jumping ahead in terms of which builds of gateware most current HL2 users are running.  You may want to gather some user statistics before you can assume that HL2s in the field are running gateware builds above 67 or 70 which include the larger buffers.

73,
Ron
n6ywu

Steve Haynal

Dec 8, 2020, 6:51:32 PM
to Hermes-Lite
Hi Ron and Group,

I don't think it is fair to compare the HL2 to WiFi devices with gigabytes of memory and fast processors. The HL2 is a wired gigabit network device, designed to work well in that scenario, and it does. The HL2 has no processor and only 74 kilobytes of memory total, which must also be used for DSP. A modern WiFi tablet or phone has gigabytes of memory plus several fast processor cores, and is likely employing relatively huge buffers compared to the HL2 to mask the large latencies from WiFi when streaming. I have been very careful to never say that the HL2 always works with WiFi on any of the pages, wikis or group posts. I think the only way for WiFi to consistently work well with the HL2 is by doing what you have done with hl2_tcp and pairing the HL2 with a fast processor and ample memory.

I don't think I assume a recent gateware version. I mention the historic 42ms TX buffer size. All my testing worked with 10ms TX buffer sizes which is well within the older 42ms available.

With hermeslite.py, debug.py and 20201121_72p6 gateware, there are now tools to measure exactly what is going on in a HL2 regarding TX latencies and relay clicks. Although it may not be easy for everyone to run these utilities, several people already have and at least another dozen on this list can do so easily. But I think the number of people experiencing problems is very small, especially among those who can easily run the utilities, so reports are low. I am interested in more VCD files posted by any people seeing issues.

ron.ni...@gmail.com

Dec 8, 2020, 10:19:23 PM
to Hermes-Lite
Hi Steve,

I'm certainly not disparaging the architectural decisions or engineering design of the HL2, which I find to be excellent.

However, it is a networked SDR.  And I have found it important to temper expectations that just because a consumer computer system and network configuration might be capable of supporting 4k videos or multiple 100 Mbs downloads, it can also support simple remote keyed CW Tx with an HL2.  The network requirements, given the current protocol, are subtly different, and have to be checked, not assumed, to be appropriate.  One of my HL2s is out on traveling loan, and I've had to help debug several of these network issues, even in a QTH with a brand new router and 1 GB fiber installation.

73,
Ron
n6ywu

Steve Haynal

Dec 8, 2020, 11:56:50 PM
to Hermes-Lite
Hi Ron,

Can you please provide more details regarding the network issues you ran into?

I agree that it is good to check the network QOS for the HL2. This is why the HL2 supports ICMP, I wrote hermeslite.py and debug.py, and am constantly reminding people of the TX buffer latency and PTT hangtime settings. I don't think I am assuming anything, but do think it is the user's responsibility with various software to check network QOS, not the HL2's. 

It would be great to have a more user friendly and accessible software to do some specific HL2 network checks if you or someone else would like to write this. The HL2 supports ICMP (including flood) and discover requests. Both of these have deterministic and little delay within the HL2 FPGA. The HL2 returns statistics for the TX buffer during operation. Software could measure, analyze and present this to a HL2 user in a meaningful way. I am willing to make low resource cost changes to the gateware to enhance and support such an effort.

73,

Steve
kf7o

ron.ni...@gmail.com

Dec 9, 2020, 1:48:08 AM
to Hermes-Lite
Hi Steve,

The current ICMP ping support is already very helpful for debugging.  Most of HL2 relay clicking and ep6 error issues I've seen correspond to finding a large difference between average and max ping round-trip times (sometimes 70 mS and up).  

This can occur even on a computer and with a network setup that can reliably support many 100's of Mbs of bidirectional network throughput.  Sometimes, there's a "forgotten" fast WiFi link somewhere in the network setup.  e.g. someone unplugs a network cable, but since the computer still has a blazing fast network connection, assumes that the connection to the HL2 is just as reliable (in terms of latency).  But then a neighbor starts hammering the WiFi channel in use.  Once, just switching channels fixed the relay clicking.

73,
Ron
n6ywu

ron.ni...@gmail.com

Dec 9, 2020, 2:16:20 AM
to Hermes-Lite
So, perhaps the "Getting Started" documentation should suggest that users listen for any extended relay clicking during their initial HL2 transmit testing.  Then, if any is detected, ask them to gather some stats from extended pinging to diagnose whether their computer or network topology is the possible or probable cause of the issue (not the HL2 or SDR software).
73,
Ron
n6ywu

ron.ni...@gmail.com

Dec 9, 2020, 2:35:15 AM
to Hermes-Lite
Also, as a possible suggestion for HL2 software developers (including myself):  Usually the Tx dropouts and relay clicking happen in conjunction with detectable ep6 errors (but not sure if always?).  
So if software detects ongoing or increasing ep6 error counts while receiving just a single slice or two, it might be appropriate for the software app to pop up some sort of warning that Tx over the current network configuration might not be the best idea in terms of generating a clean Tx signal.  Or perhaps even disable Tx if the current ep6 error rate goes high enough?
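
One way an application might track this is to watch the packet sequence numbers and count recent mismatches. The sketch below is only an illustration of the suggestion above: the class name, the window size and the warning threshold are arbitrary choices, and it assumes a 32-bit sequence counter that increments by one per received packet.

```python
from collections import deque

class Ep6Monitor:
    """Track out-of-sequence packets over a sliding window (hypothetical helper)."""

    def __init__(self, window=1000, warn_threshold=5):
        self.recent = deque(maxlen=window)   # True for each packet that was out of order
        self.expected = None
        self.warn_threshold = warn_threshold

    def packet(self, seq):
        """Record one received sequence number; return True if it was unexpected."""
        error = self.expected is not None and seq != self.expected
        self.recent.append(error)
        self.expected = (seq + 1) & 0xFFFFFFFF   # assumed 32-bit wraparound
        return error

    def should_warn(self):
        """True when enough recent packets were lost or reordered."""
        return sum(self.recent) >= self.warn_threshold
```

An application could call packet() for every received frame and show a warning, or hold off transmit, once should_warn() returns True.
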
73,
Ron
n6ywu

Matthew

Dec 9, 2020, 3:03:14 AM
to Hermes-Lite
Hi Ron,

The best way to know if relay chatter has occurred is to look at the TX IQ FIFO Count MSBs (see protocol wiki). I have proven this to be reliable on a bench with eyes and ears. I have undertaken fairly extensive testing and an EP6 packet number out of sync does not always coincide with relay chatter.

I'm not sure I follow your discussion in this thread. Are you reporting that you have seen problems with a 1 GB fibre connection from SDR app to HL2? Which SDR software was this using? Did you run Steve's diagnostics on this per this thread?

I believe there are over 500-600 HL2s in the wild now. The number of reports of HL2s having relay problems seems to be relatively small, most of these seem to have been fixed with gateware upgrades and SDR software changes.

73 Matthew M5EVT.

ron.ni...@gmail.com

Dec 9, 2020, 4:05:26 AM
to Hermes-Lite
Hi Matthew,

Yes, I stand corrected.  The FIFO count is a much better indicator of relay clicking and Tx dropouts, after the Tx starts, due to latency variability issues.

However, a couple of times when I saw or someone reported relay clicking, the clicks happened in conjunction with a rising ep6 count.
This indicates the existence of two very different classes of network issues: delayed packets versus dropped or reordered UDP packets.

And some of the issues I've seen cannot be fixed with gateware changes.  The latency variations are too large.  That seems to indicate that the network infrastructure or computer (OS network QoS ?, etc.) needs to be debugged/fixed first.

Thanks and 73,
Ron
n6ywu

Matthew

Dec 9, 2020, 4:16:33 AM
to Hermes-Lite
Ron,

There is one subtle point to note with reading the reported fill of the FIFO. The message linked from the wiki describes the reported levels in good detail. You will notice that in transition from RX to TX, for the first X reports to the PC the reported value is filling up. Sometimes I have falsely reported that as "chatter" in linHPSDR. Best to ignore the first X packets (X depends on your requested FIFO size) during RX to TX transition. This is especially important if you want to dynamically resize the FIFO based on reported level.
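
The point about skipping the first X reports can be sketched as follows. This is not linHPSDR code: the function name, the report format (a transmitting flag plus a fill level already converted to ms) and the thresholds are assumptions for illustration. The real reported value is an encoded count described in the protocol wiki.

```python
def find_underflows(reports, skip_after_keyup=4, low_ms=1.0):
    """Return indices of reports that suggest the TX FIFO ran (nearly) empty.

    reports: iterable of (transmitting, fifo_ms) tuples, one per status report
    skip_after_keyup: reports to ignore after each RX to TX transition,
                      while the FIFO is still filling
    """
    flagged = []
    since_keyup = 0
    prev_tx = False
    for i, (tx, fifo_ms) in enumerate(reports):
        if tx and not prev_tx:
            since_keyup = 0                  # RX to TX transition: restart the count
        if tx:
            if since_keyup >= skip_after_keyup and fifo_ms <= low_ms:
                flagged.append(i)
            since_keyup += 1
        prev_tx = tx
    return flagged
```

With skip_after_keyup set to 0, the low readings during the initial fill would be flagged as well, which is the false "chatter" report described above.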

73 Matthew M5EVT.