Feature Request? Auto-restart/retry of goesproc after losing data stream from goesrecv

Brad Bowers

Oct 27, 2020, 10:14:49 AM
to goestools-users
So I have somewhat of a unique setup. I have a Raspberry Pi out by the dish which runs goesrecv. Then on a virtual machine with a bunch of cpu cores and memory I have a debian server which runs goesproc and processes my gifs/images, etc.

Currently the Pi is NOT on a battery backup. Our power will blip off for a second, enough to reset the Pi... this happens weekly. When this happens the Pi restarts and automagically starts up a screen session and starts goesrecv.

The goesproc VM, however, stalls out and doesn't process anything further until I kill goesproc and restart it. Is there any way to detect a lost data stream from the goesrecv process and keep retrying to connect to it? I also have no way of knowing it has stalled other than logging in to the VM and attaching to screen to see if it stopped processing images...

If this was the case I could run everything in screen on boot and not need to worry about a momentary power blip from the SDR Pi, everything would start up and start retrying to connect as needed.

Just a thought, hoping for some help here!

Phil Biehl

Oct 27, 2020, 12:04:46 PM
to goestoo...@googlegroups.com

Brad,

I have the same remote setup as you but with the exception that goesproc runs on the pi as well. It then dumps the files to a mounted NAS drive inside the house (over WiFi). Works fine for me and does not have the problem you describe since everything is running on the pi. But, then again, I’ve only had this running for a couple of months now.

Does this help?

Phil

Brad Bowers

Oct 27, 2020, 12:17:55 PM
to goestools-users
Hi Phil,

So my only issue with that is I have just enough latency when writing to the NAS from the Pi itself over the wifi that it ends up dropping stuff because it's just a little too slow. This problem was alleviated when I moved the processing side of things to the VM. So I'm trying to avoid that scenario again. I could run ethernet all the way out there but that's an ordeal I don't have time to manage at this point, so wifi it is.

--
Brad

Phil Biehl

Oct 27, 2020, 1:51:53 PM
to goestoo...@googlegroups.com

Brad,

Are you running your WiFi at 2.4 or 5 GHz? Possibly the faster speeds of 5 GHz would work better for you?

Brad Bowers

Oct 27, 2020, 2:25:16 PM
to goestools-users
2.4GHz because it's a Pi3b. No intention of changing the Pi out for a newer one, I really don't need it to be newer. Either way, it would be nice to get the functionality for goesproc to realize it lost the stream from goesrecv and somehow attempt to retry or restart itself.

Pieter Noordhuis

Oct 28, 2020, 3:34:55 PM
to goestoo...@googlegroups.com
Hey Brad,

That's not great... These tools use a library called nanomsg to manage connections. It waits/retries connecting if the source (goesrecv) isn't ready yet. Also, if the source drops out, it should automatically try and reconnect.

If the Pi gets a power blip, I'm sure that normal TCP connection termination doesn't happen, and your goesproc VM doesn't see any FIN or RST packets for that connection. Therefore it needs to wait for the connection to time out before it can confidently decide that the source has gone away and it needs to reconnect. Perhaps this timeout is set to be very long, making it look like it never resumes.

Have you ever seen it resume, or does it always stay stuck, even for hours at a time?

Anyway, knowing what causes the problem doesn't also solve it. I looked at what options we have in nanomsg to detect this condition and have come up blank. So, we'll have to resort to something out of band to solve this.

The first thing that comes to mind (related to above), is tuning the TCP keepalive time. On many Linux systems, the default keepalive time is 2 hours. If this is indeed what's causing the issue, I would expect goesproc to resume after 2 hours and a couple of seconds. If it doesn't, then this solution won't work either. You can check the default TCP keepalive time on your system by running "sysctl net.ipv4.tcp_keepalive_time". The returned value is in seconds. If this is indeed 7200, you can try reducing it to 5 minutes or so, by running "sudo sysctl -w net.ipv4.tcp_keepalive_time=300". This will set the default keepalive time to 5 minutes for ALL connections on your system -- not just the goesproc one.
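As a rough illustration of the timing involved: on Linux a dead peer is only declared gone after the idle time plus all unanswered probes, i.e. tcp_keepalive_time + tcp_keepalive_intvl * tcp_keepalive_probes. With the usual defaults (7200, 75, 9) that works out closer to 2 hours 11 minutes than to exactly 2 hours. The sketch below reads the three settings from /proc and does the arithmetic; the function names are made up for this example. Note that these settings only affect sockets that have SO_KEEPALIVE enabled, and whether the goesproc connection does is an assumption that this thread does not confirm.

```python
from pathlib import Path


def read_keepalive_settings(base="/proc/sys/net/ipv4"):
    """Return (idle, interval, probes) as integers from the Linux sysctl files."""
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")
    return tuple(int((Path(base) / name).read_text()) for name in names)


def detection_seconds(idle, interval, probes):
    """Worst-case seconds before an idle, silently dead peer is dropped."""
    return idle + interval * probes


if __name__ == "__main__":
    # Only meaningful on Linux, where the /proc files above exist.
    if Path("/proc/sys/net/ipv4/tcp_keepalive_time").exists():
        idle, interval, probes = read_keepalive_settings()
        print(f"dead peer detected after about {detection_seconds(idle, interval, probes)} s")
```

So lowering only tcp_keepalive_time to 300 still leaves the probe phase on top of it; if the test shows goesproc recovering, expect it after roughly a quarter of an hour rather than 5 minutes unless the interval and probe count are lowered too.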

The second thing that comes to mind (if the first doesn't solve it), is to add some sort of watchdog to goesproc itself. If goesproc hasn't seen a packet in N seconds, crash. Then you can rely on your process manager to restart goesproc.
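Until something like that exists inside goesproc, the same idea can be approximated from the outside. The following is only a sketch, not part of goestools: a small Python wrapper that starts a command, treats "no output line for N seconds" as a stall, kills the process and starts it again. It assumes that a healthy goesproc prints (and flushes) at least one line within the timeout, which may not hold for every configuration; run_with_watchdog and its parameters are names invented for this example.

```python
import queue
import subprocess
import sys
import threading


def run_with_watchdog(cmd, stall_timeout, max_restarts=None):
    """Run cmd and restart it whenever it is silent for stall_timeout seconds.

    Returns the number of restarts once the child exits on its own or
    max_restarts (if given) has been reached.
    """
    restarts = 0
    while True:
        proc = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        lines = queue.Queue()

        def pump(stream=proc.stdout, q=lines):
            # Forward every output line to the queue; None marks end of output.
            for line in stream:
                q.put(line)
            q.put(None)

        threading.Thread(target=pump, daemon=True).start()

        stalled = False
        while True:
            try:
                line = lines.get(timeout=stall_timeout)
            except queue.Empty:
                stalled = True  # nothing printed within the timeout
                break
            if line is None:
                break  # child closed its output, i.e. it exited
            print(line, end="")

        if not stalled:
            proc.wait()
            return restarts

        proc.kill()
        proc.wait()
        if max_restarts is not None and restarts >= max_restarts:
            return restarts
        restarts += 1


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the full goesproc command line as arguments to this script.
    run_with_watchdog(sys.argv[1:], stall_timeout=600.0)
```

The wrapper would be started in place of goesproc (for example inside the same screen session), with the normal goesproc command line passed as its arguments. The 600 second timeout is an arbitrary starting point and should be longer than the longest quiet period seen during normal operation.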

Could you give the first solution a try, and see what happens?

Cheers,
Pieter


Brad Bowers

Oct 30, 2020, 8:25:24 AM
to goestools-users
I will have some time to do some testing this weekend... I'll change it to 5 mins... power off the pi outside for a few mins simulating a brief power outage and see what happens to the goesproc instance inside when it comes back online. More to follow!

--
Brad
KF2I

Emmett Kyle

Nov 4, 2020, 11:10:34 PM
to goestools-users
Interesting. I've seen these problems as well. I have a similar setup: a Pi B+ running goesrecv and a Pi 4 running goesproc with an attached external HD. This setup easily handles streams from both GOES16 and GOES17. The goesrecv Pi has a consistent load of 3 of the 4 cores. It is also running telegraf for data collection. Another option might be to do something outside of goestools, like ping the goesrecv host every so often and, if it doesn't respond, wait until it responds again and then restart goesproc.
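A minimal sketch of that out-of-band check, with several assumptions: instead of an ICMP ping it tries a TCP connection to the port goesrecv publishes on (the host name and port are whatever goesproc subscribes to), and the restart action is left as a callback because it depends on how goesproc is started on the processing machine. All function names are invented for this example.

```python
import socket
import time


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def recovered_after_outage(samples):
    """For each up/down sample, yield True when the host is up again after being down."""
    was_down = False
    for up in samples:
        yield up and was_down
        was_down = not up


def monitor(host, port, restart, interval=10.0):
    """Poll host:port forever and call restart() each time it comes back."""

    def samples():
        while True:
            yield tcp_probe(host, port)
            time.sleep(interval)

    for recovered in recovered_after_outage(samples()):
        if recovered:
            restart()
```

The restart callback could, for example, run a restart command through subprocess if goesproc is managed as a service; that part is hypothetical and not shown. The polling interval needs to be shorter than the time the Pi takes to reboot and bring goesrecv back up, otherwise a short outage can slip between two probes unnoticed.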

Brad Bowers

Nov 15, 2021, 7:20:39 PM
to goestools-users
Bringing this back to life... My goesproc server keeps losing the feed from the goesrecv running on the separate raspberry pi. Is there a way to get it so I don't have to keep logging in and restarting goesproc on the server?