Sleep calls in thread funcs

Alice

Nov 11, 2020, 5:07:46 PM
to libplctag
Hi, 

I'm running libplctag with cgo and it's using too much CPU. Or at least, it doesn't seem to be giving up the CPU to other processes. I wanted to increase the duration of some sleep calls in the code, but wasn't sure what exactly some of the parent functions did and if it would greatly affect execution. 

Some more context: I'm basically opening a connection with an AB PLC once, and then reading a set of tags every second. Writes to tags happen sporadically and much less frequently. Ideally the connection is never closed and we keep reading tags at 1Hz.

The 2 main places where I wanted to lengthen the duration of sleep calls are:

1. THREAD_FUNC(session_handler) in ab/session.c
This seems to keep track of the session state, and the opening/closing of connections with the PLC. After I open the connection, it seems to stay in SESSION_IDLE state. If I'm not concerned about closing the connection, is it actually doing anything useful in the switch statement SESSION_IDLE case or can I increase the sleep_ms call to something like 100ms? 

2. THREAD_FUNC(tag_tickler_func) in lib/lib.c
This implements auto-reads and auto-writes and callbacks, but I'm not using any of that functionality. It also calls the tickler func for all tags, but why does tag->vtable->tickler(tag) need to be called every ms in this THREAD_FUNC? It doesn't seem useful for my use case, which is calling Read() and then Status() and then GetValue() on each tag. If it's not useful in my specific scenario, then it also seems like I can increase this sleep_ms call to 100ms. 

Thank you,
Alice

Kyle

Nov 12, 2020, 12:33:03 AM
to libplctag
Hi, thanks for using the library.

See inline for comments.

On Wednesday, November 11, 2020 at 2:07:46 PM UTC-8 Alice wrote:
> Hi,
>
> I'm running libplctag with cgo and it's using too much CPU. Or at least, it doesn't seem to be giving up the CPU to other processes. I wanted to increase the duration of some sleep calls in the code, but wasn't sure what exactly some of the parent functions did and if it would greatly affect execution.
>
> Some more context: I'm basically opening a connection with an AB PLC once, and then reading a set of tags every second. Writes to tags happen sporadically and much less frequently. Ideally the connection is never closed and we keep reading tags at 1Hz.


This is a fairly standard approach.   Are you using async read/write or are you adding in timeouts?
 
> The 2 main places where I wanted to lengthen the duration of sleep calls are:
>
> 1. THREAD_FUNC(session_handler) in ab/session.c
> This seems to keep track of the session state, and the opening/closing of connections with the PLC. After I open the connection, it seems to stay in SESSION_IDLE state. If I'm not concerned about closing the connection, is it actually doing anything useful in the switch statement SESSION_IDLE case or can I increase the sleep_ms call to something like 100ms?

I would not change this one.  This thread is used to send and receive packets to and from an individual PLC (one thread per PLC).   If you increase the sleep time here, you will delay packet processing.   All packet processing is done asynchronously, so adding delays here can significantly impact performance. 

Different OSes have different performance characteristics for networking and many will provide partial data (particularly from PLCs which have terrible network performance in general).   So if you increase this to 100ms, you may find that you get half of a packet, then wait 100ms and then get the next half. 

Due to the way that both the AB and Modbus protocols work, each PLC processes one request packet at a time (which is why the library does request packing where possible).   So if you cause the delays between them to be large, you will drop the overall performance significantly.   I've done it by accident many times while developing the library!
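To get a rough feel for how the sleep length compounds when requests are handled one at a time, a back-of-the-envelope model can help. The sketch below is only an illustration under stated assumptions: it assumes each request waits on a fixed number of thread wakeups (for example one to send and one to receive) and that each wakeup can be late by up to one full sleep. The function name and parameters are hypothetical and do not come from the library.

```c
#include <assert.h>

/*
 * Simplified model (an assumption, not measured library behavior):
 * requests to one PLC are handled strictly one after another, and each
 * request waits on `wakeups_per_request` thread wakeups, each of which
 * can be up to `sleep_ms` late.
 */
long worst_case_added_delay_ms(long requests, long wakeups_per_request, long sleep_ms)
{
    assert(requests >= 0 && wakeups_per_request >= 0 && sleep_ms >= 0);
    return requests * wakeups_per_request * sleep_ms;
}
```

For a hypothetical cycle of 20 requests with 2 wakeups each, a 100ms sleep could add up to 4000ms per cycle, which no longer fits in a 1Hz polling period, while a 5ms sleep could add up to 200ms. Partial packets would push the wakeup count per request higher.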
 

> 2. THREAD_FUNC(tag_tickler_func) in lib/lib.c
> This implements auto-reads and auto-writes and callbacks, but I'm not using any of that functionality. It also calls the tickler func for all tags, but why does tag->vtable->tickler(tag) need to be called every ms in this THREAD_FUNC? It doesn't seem useful for my use case, which is calling Read() and then Status() and then GetValue() on each tag. If it's not useful in my specific scenario, then it also seems like I can increase this sleep_ms call to 100ms.


This one you can possibly change a bit, but again doing so will impact overall performance.   If you use async mode when calling plc_tag_read() or plc_tag_write(), then any additional delay here will impact overall request processing, particularly of larger tags where multiple requests are required.

Note that in general, delaying 1ms is not actually delaying 1ms. This is done as a platform-independent way of causing the OS to reschedule the thread, thus handing the CPU back to some other thread. I have measured the actual time taken on loaded systems and it can vary from 1 to 50ms before the internal library threads run again. On a lightly loaded system, you may see an average closer to 1ms between runs of a thread.
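The gap between the requested and the actual sleep time can be checked on a given machine with a small standalone measurement. The following is a minimal sketch, assuming a POSIX system with nanosleep() and CLOCK_MONOTONIC. The helper names are illustrative and are not part of libplctag, whose own sleep_ms() may be implemented differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock reading in microseconds. */
static int64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Sleep for at least `ms` milliseconds, resuming if a signal interrupts. */
static void sleep_at_least_ms(int ms)
{
    struct timespec req = { ms / 1000, (long)(ms % 1000) * 1000000L };
    while (nanosleep(&req, &req) == -1 && errno == EINTR) {
        /* req now holds the remaining time, so just try again */
    }
}

/* Average wall-clock microseconds actually spent per requested sleep. */
int64_t average_sleep_us(int requested_ms, int iterations)
{
    assert(requested_ms > 0 && iterations > 0);
    int64_t start = now_us();
    for (int i = 0; i < iterations; i++) {
        sleep_at_least_ms(requested_ms);
    }
    return (now_us() - start) / iterations;
}
```

Calling average_sleep_us(1, 1000) while the system is idle and again while it is busy gives a rough picture of how far a nominal 1ms sleep stretches on that hardware.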

I am interested in understanding more about the structure of your program. I have run the library on relatively underpowered systems in the past, such as Raspberry Pi and MIPS emulated under QEMU. Even on relatively slow Windows tablets (Bay Trail CPU), I see less than 1% CPU time when running tests with many tags all reading and writing simultaneously. That said, I think all but the MIPS system have more than one real or simulated CPU.

I have very little experience with Go, so I would like to understand the platform better.   It is always possible that there is some unexpected interaction between Go and the library!  Go's goroutines may be doing things that do not work well with the underlying threads or vice versa.  Go tends to take over the full set of CPUs it is given, so perhaps that is part of the problem?

What version of the library are you using?  What OS and CPU is your program running on?  

Thanks for raising this issue.  I hope you will understand if I ask a lot of questions. I want to make sure that the library works well with Go!

Best,
Kyle

Alice

Nov 16, 2020, 8:49:57 PM
to libplctag
Hi Kyle, thanks for the quick response! I am running Go on a single core Linux processor, so I'm concerned that the THREAD_FUNCs running in the background are actually taking up a lot more processor time than they need to be. I have been testing with 2.1.17 (was previously running 2.0.7). I am adding in timeouts, not using async. 

When you say that adding delays in the session handler can significantly impact performance, is the performance referring to the amount of time it takes a read call to complete? As long as nothing will start returning errors, I'm ok with read/write calls taking longer if the overall CPU time goes down. Aside from longer wait times for function calls, is there anything that would be negatively impacted by these longer sleep calls? Since the data frequency I'm concerned with is 1Hz, I imagine that delays on the order of 10s of ms will not have a huge impact. 

Best,
Alice

Kyle

Nov 16, 2020, 10:59:23 PM
to libplctag
Hi Alice,

On Monday, November 16, 2020 at 5:49:57 PM UTC-8 Alice wrote:
> Hi Kyle, thanks for the quick response! I am running Go on a single core Linux processor, so I'm concerned that the THREAD_FUNCs running in the background are actually taking up a lot more processor time than they need to be. I have been testing with 2.1.17 (was previously running 2.0.7). I am adding in timeouts, not using async.


Hmm...   Single core.   That may change things.  I am surprised that Go works very well on a single core.   I guess later versions of Go use preemptive goroutines?  How powerful is that single core?
 
> When you say that adding delays in the session handler can significantly impact performance, is the performance referring to the amount of time it takes a read call to complete? As long as nothing will start returning errors, I'm ok with read/write calls taking longer if the overall CPU time goes down. Aside from longer wait times for function calls, is there anything that would be negatively impacted by these longer sleep calls? Since the data frequency I'm concerned with is 1Hz, I imagine that delays on the order of 10s of ms will not have a huge impact.


I think you can reasonably increase the delay time if you really only need 1Hz.  

How many tags are you reading?   If it is a lot, then you probably want to do something like the code in async_stress.c.

I would try to adjust just one of the threads at a time.   Start with the main background thread (not the PLC/session thread) and see if changing that helps.   This is the sleep_ms() call in src/lib/lib.c in tag_tickler_func() at about line 368.  Try 5 milliseconds.

If that does not give you much CPU back, try increasing the sleep in src/protocols/ab/session.c:

In the session_handler function, there is a sleep_ms() at about line 1209.   You can change that one to about 5ms without too much impact.  

There are two more in the functions that send and receive packets.   I would attempt to leave those alone at first.  Only change those if you really need to.   Generally they will only be active once in a while (when packets are either in flight or needing to be sent).
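When experimenting with different values in a patched copy of the source, it may be easier to lift the sleep durations into compile-time constants than to edit each call site repeatedly. The sketch below shows one possible shape. The macro names are hypothetical and are not defined by libplctag, and the defaults of 1 are assumed placeholders for whatever value the call site currently uses.

```c
#include <assert.h>

/*
 * Hypothetical macro names; libplctag does not define these.
 * Override at build time, for example with -DTICKLER_SLEEP_MS=5.
 */
#ifndef TICKLER_SLEEP_MS
#define TICKLER_SLEEP_MS 1          /* assumed default for tag_tickler_func() */
#endif

#ifndef SESSION_IDLE_SLEEP_MS
#define SESSION_IDLE_SLEEP_MS 1     /* assumed default for session_handler() */
#endif

/* Reject nonsensical values at compile time rather than at run time. */
static_assert(TICKLER_SLEEP_MS >= 1, "TICKLER_SLEEP_MS must be at least 1");
static_assert(SESSION_IDLE_SLEEP_MS >= 1, "SESSION_IDLE_SLEEP_MS must be at least 1");

/*
 * A patched call site would then read, for example:
 *     sleep_ms(TICKLER_SLEEP_MS);
 */
```

Keeping one macro per thread makes it straightforward to adjust just one of the threads at a time.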

Best,
Kyle

Kyle

Nov 26, 2020, 12:26:11 PM
to libplctag
Hi Alice,

Another user opened an issue about the same problems with embedded systems.  See issue 217.

I will be working on supporting some combination of compile flags to do at least part of this.  

Best,
Kyle

Alice

Nov 30, 2020, 6:17:39 PM
to libplctag
Thanks for looking into this and answering all my questions! I increased most sleeps to 5ms, and that helped significantly (and performance didn't drop too much). 

It also seems that calling into cgo is very expensive in Go. I think autoread may help in that respect. Is there an easy way to create batched functions for getting the status or value of tags? That way I can call into cgo once and get information for all the tags, instead of calling into cgo n times, where n is the # of tags. 

Alice

Kyle

Dec 1, 2020, 12:29:01 AM
to libplctag
Glad to hear that it seems to work with the longer wait times!

I have been experimenting a bit to find out what the impacts of these wait delays are. So far, it looks like I can increase the delays to about 3ms without a large impact on performance. However, even moving to 2ms has an impact in some situations. I need to keep looking to see what else can be done. I used non-blocking IO and simple short sleeps to keep the code simple and clean and to make sure that I did not create a lot of threads.

Thanks for the information about cgo. I did not know that it had such a cost. Is it possible for you to do a small amount of work in C? It might make sense to set up auto-sync tags and then hook in a C callback function on read completion. This will eliminate almost all of the mutex operations that happen when you call libplctag API functions. I think that reducing the total number of calls into the library (and thus cgo and mutex overhead) is probably the only way to really lower the cost.

If you try to put all the tags into one giant tag on the PLC and read that, you will reduce the number of library API calls a little bit, but you will still need to read out the data from the internal tag buffer in memory.   Each call to plc_tag_get_Xint() will have the cgo and mutex overhead.
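As a rough illustration of doing a small amount of work in C, a thin helper could loop over the tags on the C side so that Go crosses the cgo boundary once per batch instead of once per tag. This is only a sketch: batch_get_int32 and demo_get_int32 are hypothetical names, not library functions. The getter is passed as a function pointer purely so the example compiles without the library, and in a real build the loop would presumably call plc_tag_get_int32() directly. The per-call mutex cost inside the library would remain; only the cgo crossings are reduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as plc_tag_get_int32(int32_t tag, int offset). */
typedef int32_t (*get_int32_fn)(int32_t tag, int offset);

/*
 * Copy one 32-bit value (at byte offset 0) from each tag into `out`.
 * Hypothetical helper, not part of libplctag. `tags` and `out` must
 * both have room for `count` elements.
 */
void batch_get_int32(get_int32_fn get, const int32_t *tags, size_t count, int32_t *out)
{
    assert(get != NULL && tags != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = get(tags[i], 0);
    }
}

/* Stand-in getter so the sketch runs without the library or a PLC. */
int32_t demo_get_int32(int32_t tag, int offset)
{
    return tag * 10 + offset;
}
```

The same pattern would apply to a batched status check, with the helper filling an array of status codes in one call.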

Note that in general Linux has exceptionally low mutex overhead.   Even on embedded systems it should be low.   But it is there and the interaction with Go may be a problem.

Best,
Kyle