errors when connecting with UaExpert


Peter Speybrouck

Apr 25, 2015, 11:13:42 AM
to open...@googlegroups.com
Hi,

I was just giving the server a quick spin but I'm getting some errors when connecting to the server with UaExpert.

2 situations:
1. 64-bit windows version (downloaded from the website)
   => UaExpert connects, but server log shows these lines:
[04/25/2015 14:57:15.769.000] info/communication        Unknown request: NodeId(ns=0, i=787)
[04/25/2015 14:57:15.771.000] info/communication        The message was not entirely processed, skipping to the end

while UaExpert logs show:
[uastack] OpcUa_Channel_OnNotify: Underlying connection raised unexpected error event with status 0x80050000!

2. server compiled and run on Olimex Olinuxino Lime2
first time I saw the same server logs but when I tried to reproduce I didn't get them:
[04/25/2015 14:54:01.099.069] info/communication        Unknown request: NodeId(ns=0, i=787)
[04/25/2015 14:54:01.099.699] info/communication        The message was not entirely processed, skipping to the end

but UaExpert seems to reconnect every time this error shows up with logs:
Connection status of server 'Lime2' changed to 'NewSessionCreated'.
Connection status of server 'Lime2' changed to 'Connected'.
[uastack] OpcUa_Channel_ResponseAvailable: Decoding failed! (0x80B00000)
Connection status of server 'Lime2' changed to 'ConnectionErrorApiReconnect'.

if I browse I get some more errors even though the browse seems to work eventually:
Browse failed with error 'BadConnectionClosed'.
Read attributes of node 'NS1|String|the.answer' failed [ret = BadEndOfStream].



Did anyone notice similar behaviour?

Sten Grüner

Apr 25, 2015, 11:49:31 AM
to open...@googlegroups.com
Hi Peter,

thank you for the feedback!

the server error Unknown request: NodeId(ns=0, i=787) is okay and *intended*: the client tries to create a subscription, and we do not support them yet.
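
For context: to the best of my knowledge, i=787 in namespace 0 is the numeric NodeId of the default binary encoding of CreateSubscriptionRequest. A minimal sketch of how a server might dispatch on the request type id and fall through to an "unknown request" branch follows. The enum and function names are hypothetical and are not open62541's API; the numeric ids are believed to match the OPC UA NodeId list.

```c
#include <stdint.h>

/* Numeric ids (namespace 0) of the default binary encodings of a few
 * request types. The names are made up for this sketch. */
enum request_type_id {
    REQ_BROWSE              = 527,
    REQ_READ                = 631,
    REQ_CREATE_SUBSCRIPTION = 787
};

enum dispatch_result { DISPATCH_HANDLED, DISPATCH_UNKNOWN };

/* Return DISPATCH_HANDLED for services this toy server implements and
 * DISPATCH_UNKNOWN for everything else, including subscriptions. */
static enum dispatch_result dispatch_request(uint32_t type_id) {
    switch (type_id) {
    case REQ_BROWSE:
    case REQ_READ:
        return DISPATCH_HANDLED;
    default:
        /* A real server would log "Unknown request" here and skip the
         * rest of the message, as in the server log lines quoted in
         * this thread. */
        return DISPATCH_UNKNOWN;
    }
}
```

With this shape, a CreateSubscriptionRequest ends up in the default branch, which matches the log line being harmless.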

1)
I cannot reproduce the UaExpert log error right now. Could you please tell us which UaExpert version you are using? And does the connection with UaExpert otherwise work as expected?

2)
This seems to be a network layer issue. Do you use Debian? Could you capture and upload a Wireshark dump of the communication, so we can make sure the connection is dropped after the 787 request? https://github.com/acplt/open62541/wiki/Capturing-traffic-with-Wireshark

Best Regards
--
You received this message because you are subscribed to the Google Groups "open62541" group.
To unsubscribe from this group and stop receiving emails from it, send an email to open62541+...@googlegroups.com.
To post to this group, send email to open...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/open62541/2078dc67-e881-4018-b5dd-c55b39b5862b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


-- 
Sten Grüner
Chair of Process Control Engineering
RWTH Aachen University

Turmstrasse 46, 52064 Aachen, Germany
Tel. +49 (0) 241 80-97745
Fax  +49 (0) 241 80-92238

www.plt.rwth-aachen.de 

Peter Speybrouck

Apr 25, 2015, 12:45:01 PM
to open...@googlegroups.com
Ah, I was indeed trying to subscribe for updates.

UaExpert 1.3.0 201 (it prompted me to update but didn't do that yet)

However, I can't seem to reproduce the 0x80050000 errors now. The connection is working and I can browse the server and see the values.

On the Lime2 device I am indeed using debian Jessie.
The errors I'm getting are not related to the subscription request. Just connecting to it already causes reconnections.
I can still browse but it's just reconnecting all the time.

Logs from the last attempt (read bottom to top), the server log didn't show any errors or warnings this time.

18:38:20.333 | Server Node        | Lime2                          | Disconnect succeeded.
18:38:20.329 | Server Node        | Lime2                          | Connection status of server 'Lime2' changed to 'Disconnected'.
18:38:16.600 | Server Node        | Lime2                          | Connection status of server 'Lime2' changed to 'Connected'.
18:38:16.600 | Server Node        | Lime2                          | Connection status of server 'Lime2' changed to 'NewSessionCreated'.
18:38:11.007 | Server Node        | Lime2                          | Connection status of server 'Lime2' changed to 'ConnectionErrorApiReconnect'.
18:38:11.005 | General            |                                | [uastack] OpcUa_Channel_ResponseAvailable: Decoding failed! (0x80080000)
18:38:07.321 | Attribute Plugin   | Lime2                          | Read attributes of node 'NS1|Numeric|76' failed [ret = BadEndOfStream].
18:38:07.320 | Reference Plugin   | Lime2                          | Browse succeeded.
18:38:07.320 | General            |                                | [uastack] OpcUa_Channel_ResponseAvailable: Decoding failed! (0x80B00000)
18:38:04.262 | AddressSpaceModel  | Lime2                          | Browse succeeded.
18:38:04.258 | TypeCache          | Lime2                          | InverseName = OrganizedBy
18:38:04.258 | TypeCache          | Lime2                          | Description = Organizes
18:38:04.258 | TypeCache          | Lime2                          | DisplayName = Organizes
18:38:04.258 | TypeCache          | Lime2                          | BrowseName = 0:Organizes
18:38:04.258 | TypeCache          | Lime2                          | Read succeeded.
18:38:04.254 | TypeCache          | Lime2                          | Reading type info of NodeId NS0|Numeric|35
18:38:04.254 | TypeCache          | Lime2                          | InverseName = TypeDefinitionOf
18:38:04.254 | TypeCache          | Lime2                          | Description = HasTypeDefinition
18:38:04.254 | TypeCache          | Lime2                          | DisplayName = HasTypeDefinition
18:38:04.254 | TypeCache          | Lime2                          | BrowseName = 0:HasTypeDefinition
18:38:04.254 | TypeCache          | Lime2                          | Read succeeded.
18:38:04.249 | Attribute Plugin   | Lime2                          | Read attributes of node 'NS0|Numeric|84' failed [ret = BadEncodingLimitsExceeded].
18:38:04.248 | TypeCache          | Lime2                          | Reading type info of NodeId NS0|Numeric|40
18:38:04.248 | Reference Plugin   | Lime2                          | Browse succeeded.
18:38:04.248 | General            |                                | [uastack] OpcUa_Channel_ResponseAvailable: Decoding failed! (0x80080000)
18:38:04.248 | AddressSpaceModel  | Lime2                          | Browse succeeded.
18:38:00.840 | Server Node        | Lime2                          | Connection status of server 'Lime2' changed to 'Connected'.
18:37:55.239 | AddressSpaceModel  | Lime2                          | Browse failed with error 'BadConnectionClosed'.
18:37:55.235 | Server Node        | Lime2                          | Connection status of server 'Lime2' changed to 'ConnectionErrorApiReconnect'.
18:37:55.234 | General            |                                | [uastack] OpcUa_Channel_ResponseAvailable: Decoding failed! (0x80080000)
18:37:55.231 | AddressSpaceModel  | Lime2                          | Browse succeeded.
18:37:55.221 | Server Node        | Lime2                          | Revised values: SessionTimeout=10000, SecureChannelLifetime=600000
18:37:55.220 | Server Node        | Lime2                          | Successfully connected UA server.
18:37:55.219 | Server Node        | Lime2                          | Connection status of server 'Lime2' changed to 'Connected'.
18:37:54.862 | Server Node        | Lime2                          | The server returned no certificate, all certificate checks will be skipped.
18:37:54.858 | Server Node        | Lime2                          | ApplicationUri: 'urn:unconfigured:open62541:open62541Server'
18:37:54.855 | Server Node        | Lime2                          | Found security policy 'http://opcfoundation.org/UA/SecurityPolicy#None'
18:37:54.850 | Server Node        | Lime2                          | Found endpoint 'opc.tcp://lime2:16664'
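
The hex values in the [uastack] lines are OPC UA status codes. The log itself pairs 0x80B00000 with BadEndOfStream and 0x80080000 with BadEncodingLimitsExceeded; the other two names below are taken from the OPC UA status code list as I remember it, so treat them as an assumption. A small lookup sketch (the function names are hypothetical) makes the logs easier to read:

```c
#include <stdint.h>
#include <string.h>

/* Symbolic names for the status codes that appear in this thread. */
static const char *status_name(uint32_t code) {
    switch (code) {
    case 0x80050000u: return "BadCommunicationError";
    case 0x80080000u: return "BadEncodingLimitsExceeded";
    case 0x80AE0000u: return "BadConnectionClosed";
    case 0x80B00000u: return "BadEndOfStream";
    default:          return "unknown";
    }
}

/* Check that a symbolic name printed in a log matches a hex code. */
static int status_agrees(uint32_t code, const char *name) {
    return strcmp(status_name(code), name) == 0;
}
```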


I will see if I can grab a wireshark capture.

Peter Speybrouck

Apr 25, 2015, 12:59:12 PM
to open...@googlegroups.com
In attachment a capture (made on windows, the machine that's running UaExpert) and the corresponding UaExpert logs.

I hope it helps.
capture.zip

Julius Pfrommer

Apr 25, 2015, 8:40:37 PM
to open...@googlegroups.com
Hi,

Sten found the issue in the Wireshark dump. On TCP retransmissions (in a lossy or congested network), we did not properly retry after the EAGAIN error code.
https://github.com/acplt/open62541/commit/a61c12e87453c4fc141882845bbc3b2aae7b948a
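
As an illustration of the general technique only (this is not the code from the commit above), a send loop on a non-blocking socket usually has to handle partial writes and retry after EAGAIN/EWOULDBLOCK once the socket is writable again. The helper name send_all is hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send the whole buffer on a (possibly non-blocking) socket.
 * Returns 0 on success, -1 on a real error (errno is set). */
static int send_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n >= 0) {            /* partial write: advance and continue */
            p += n;
            len -= (size_t)n;
            continue;
        }
        if (errno == EINTR)
            continue;            /* interrupted by a signal, just retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Kernel send buffer is full; wait until writable again. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                return -1;
            continue;
        }
        return -1;               /* any other error is fatal here */
    }
    return 0;
}
```

Treating EAGAIN as a fatal error, or ignoring it and moving on, would drop or truncate a response, which is consistent with the decoding failures the client reported.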

Best, Julius

Peter Speybrouck

Apr 26, 2015, 6:05:30 AM
to open...@googlegroups.com
Thanks for looking into it.

I pulled the latest commit, but the issue seems to be still there (logs/capture attached)
I have the impression that browsing the address space seems to be triggering this, but I do get to see the nodes under "Objects" eventually.
However, I don't get to see the details in the Attributes window of UaExpert.

Also, when pressing Ctrl+C to stop the example server, I get this:

^CReceived Ctrl-C
*** Error in `./exampleServer': free(): invalid pointer: 0x01851cd4 ***
Aborted

(the above is with exampleServer on Lime2)
I also upgraded UaExpert to 1.3.1 but no change.
capture2.zip

Sten Grüner

Apr 26, 2015, 6:19:01 AM
to open...@googlegroups.com
Hmm, yes, your impression is correct, since the faulty packet comes in a read response.

Would you be so kind as to test the tagged releases RC1, RC2 and RC3.3 (https://github.com/acplt/open62541/tags)? I just want to make sure there is no regression.

Best Regards
Sten

Peter Speybrouck

Apr 26, 2015, 4:36:05 PM
to open...@googlegroups.com, s.gr...@plt.rwth-aachen.de
if I check out the tagged releases:
  checkout tags/<tag_name>
and run make again, I still have the same behaviour (disconnecting and not able to see the attributes), so I guess it's not a regression.

However, I tested it now on a raspberry pi as well and here it seems to work without a problem.

So perhaps there is an issue with the debian (jessie) distribution or drivers from the olinuxino lime2.
I'm using this image (3.4 kernel, jessie) http://www.igorpecovnik.com/2014/11/18/olimex-lime-debian-sd-image/

On the lime2, I have the problem with both the ethernet and the wifi connection so it must be a more general problem than one specific network driver.

or could one of these messages on startup of the exampleServer explain why this problem occurs?
[04/26/2015 17:59:29.904.408] warning/userland  [Linux specific] Can not open temperature file, no temperature node will be added
[04/26/2015 17:59:29.904.871] warning/userland  [Raspberry Pi specific] Can not open trigger or LED file (try to run server with sudo if on a Raspberry PI)
[04/26/2015 17:59:29.905.176] warning/userland  An LED node will be added but no physical LED will be operated

Sten Grüner

Apr 27, 2015, 2:03:39 AM
to open...@googlegroups.com
Hi Peter,

No, the userland warnings do no harm.
The first one just says that the file /sys/class/thermal/thermal_zone0/temp does not exist; it is usually present on Linux systems and contains the temperature of the CPU.
The second one says that the file /sys/class/leds/led0/brightness does not exist; it is used on the Raspberry Pi to control an LED.
Neither of them affects the communication.

We also test our code on RPis and other ARM devices, e.g. WAGO-750-860, and I haven't seen such a network problem before.

Best Regards
Sten

Peter Speybrouck

Apr 27, 2015, 4:41:24 AM
to open...@googlegroups.com, s.gr...@plt.rwth-aachen.de
What is the nature of the networking issues here?
bad data sent by client or server?
Missing data?

Is there extra debugging/logging that I can enable to determine where the problems originate from?
I'm not familiar with the code yet, but I'd like to help figure this out if I can.

Sten Grüner

Apr 27, 2015, 10:42:27 AM
to open...@googlegroups.com
Hello Peter,

I fear there is no more debugging that can be turned on.

The problem is that the server sends a corrupted ReadResponse message. You can trace it in Wireshark, where [malformed message] is written next to the particular packet.

It seems that the buffer that is going to be sent over the wire gets corrupted at some point (the beginning of the reply is okay, but at some point it seems to send some random data). Consider packet 106 in your first capture.

In particular, you can see it at the last integer of the message, the ArraySize of the array of DiagnosticInfos. It is set to some random value, something like 1801, but it should be -1.
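
To check this on raw bytes, one can decode the last four bytes of the message as a little-endian Int32; a null array is encoded with length -1, i.e. the bytes ff ff ff ff. A minimal sketch with hypothetical helper names, assuming the array size really is the final field of the message:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian Int32 from raw bytes without relying on
 * alignment or host byte order. The final cast assumes the usual
 * two's complement wrap-around, which GCC provides. */
static int32_t read_i32_le(const uint8_t *p) {
    uint32_t u = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return (int32_t)u;
}

/* Return the array length stored in the last four bytes of a message,
 * i.e. where the trailing DiagnosticInfos array size is expected. */
static int32_t trailing_array_size(const uint8_t *msg, size_t len) {
    assert(msg != NULL && len >= 4);
    return read_i32_le(msg + len - 4);
}
```

A healthy response should yield -1 here; a value such as 1801 (bytes 09 07 00 00) points to a corrupted tail.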

The service is invoked here: https://github.com/acplt/open62541/blob/master/src/server/ua_server_binary.c#L307

The answer will be contained in UA_ReadResponse r and encoded in ByteString message.

The buffer is handed over to the tcp stack here: https://github.com/acplt/open62541/blob/master/src/server/ua_server_binary.c#L372.

So first I'd investigate whether the buffer is okay at this point; you have to compare it to the byte string captured in Wireshark. If it is already malformed at this point, we have some problem in the service call (which I do not believe ;). If it is intact here but differs from the capture, then the problem is in the network layer, kernel, driver etc.

HTH
Sten

Peter Speybrouck

Apr 27, 2015, 3:03:09 PM
to open...@googlegroups.com, s.gr...@plt.rwth-aachen.de
Hi,

with your hints, I added some silly printf statements to print the message bytes (see attachment) just before this statement:
connection->write(connection, &message);

In the log I also inserted the corresponding bytes captured by wireshark.

This seems to indicate that the encoded ByteString message is already corrupt before sending it. This at least excludes some funky driver or linux problems.
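
For anyone repeating this comparison, a small formatter along the following lines is enough. It is only a sketch and not the code from the attachment; hex_format is a hypothetical name. It prints bytes as space-separated lowercase hex so the output can be compared by eye with a hex view of the captured packet.

```c
#include <stddef.h>
#include <stdio.h>

/* Format len bytes as space-separated lowercase hex into out.
 * Returns the number of characters written (excluding the NUL), or -1
 * if out is too small. */
static int hex_format(const unsigned char *data, size_t len,
                      char *out, size_t out_size) {
    size_t pos = 0;
    if (out == NULL || out_size == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        int n = snprintf(out + pos, out_size - pos,
                         i ? " %02x" : "%02x", data[i]);
        if (n < 0 || (size_t)n >= out_size - pos)
            return -1;           /* output would have been truncated */
        pos += (size_t)n;
    }
    return (int)pos;
}
```

Calling it on the encoded message right before the write call and printing the result gives one line per message.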
I will try and see if I can figure out where this goes wrong.

I guess a new issue for this can be opened...
log2.txt

Sten Grüner

Apr 27, 2015, 4:46:25 PM
to open...@googlegroups.com
Hmm... looking into your Wireshark captures, it seems that something goes wrong after a timestamp. If there are no timestamps in the reply, everything is fine. So... timestamps are UA_UInt64... which are the only ones that are encoded...

could you execute this somewhere?

printf("%zu\n", sizeof(UA_UInt64));

just to make sure....

could you also run the server in valgrind? "valgrind ./exampleServer" to make sure there is no illegal memory access?

Best Regards
Sten

Peter Speybrouck

Apr 29, 2015, 1:21:22 PM
to open...@googlegroups.com, s.gr...@plt.rwth-aachen.de
sizeof(UA_UInt64) = 8, just like on Raspberry PI (first edition B).

I never used valgrind before, but strangely I cannot reproduce the issue with valgrind. I can browse the address space, no issue with timestamps.
Then I run it again without valgrind and I get problems again.

This is the output from valgrind:
==30872==
==30872== HEAP SUMMARY:
==30872==     in use at exit: 9,600 bytes in 13 blocks
==30872==   total heap usage: 7,066 allocs, 7,053 frees, 2,654,623 bytes allocated
==30872==
==30872== LEAK SUMMARY:
==30872==    definitely lost: 9,600 bytes in 13 blocks
==30872==    indirectly lost: 0 bytes in 0 blocks
==30872==      possibly lost: 0 bytes in 0 blocks
==30872==    still reachable: 0 bytes in 0 blocks
==30872==         suppressed: 0 bytes in 0 blocks
==30872== Rerun with --leak-check=full to see details of leaked memory
==30872==
==30872== For counts of detected and suppressed errors, rerun with: -v
==30872== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


if I run it with --leak-check=full:

==30902==
==30902== HEAP SUMMARY:
==30902==     in use at exit: 4,320 bytes in 6 blocks
==30902==   total heap usage: 4,388 allocs, 4,382 frees, 1,782,868 bytes allocated
==30902==
==30902== 1,440 bytes in 3 blocks are definitely lost in loss record 1 of 2
==30902==    at 0x483E394: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==30902==
==30902== 2,880 bytes in 3 blocks are definitely lost in loss record 2 of 2
==30902==    at 0x4840CE8: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==30902==
==30902== LEAK SUMMARY:
==30902==    definitely lost: 4,320 bytes in 6 blocks
==30902==    indirectly lost: 0 bytes in 0 blocks
==30902==      possibly lost: 0 bytes in 0 blocks
==30902==    still reachable: 0 bytes in 0 blocks
==30902==         suppressed: 0 bytes in 0 blocks
==30902==
==30902== For counts of detected and suppressed errors, rerun with: -v
==30902== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

Sten Grüner

Apr 30, 2015, 2:09:51 AM
to open...@googlegroups.com
Hello Peter,

I am sorry to hear that. Well, valgrind basically replaces malloc, realloc, and free of the C library and logs the memory allocations, e.g. to trace whether memory is leaking.

Valgrind shows we have some memory leaks, but leaks do not explain the corrupted buffers. Actually, with --leak-check=full, it should also show where the memory is leaking. Do you have debugging symbols enabled in the binary you run in valgrind?

Actually, I'd expect valgrind to show some invalid memory accesses, uninitialized variables being accessed or something similar...

3 points to investigate:
- check if there are memleaks on rpi (I was actually not aware we have some left)
- try to trace them on lime (check for debug symbols), try cmake -DCMAKE_BUILD_TYPE=Debug ..
- try another compiler/libc on lime, e.g. by changing the distribution

HTH
Sten

Peter Speybrouck

May 14, 2015, 7:54:06 AM
to open...@googlegroups.com, s.gr...@plt.rwth-aachen.de
Hi,

I have been doing some more tests (I posted something about it on IRC).
On Raspberry PI, I cannot reproduce the problem.

I also ran the testsuite on Lime2 which turns up 2 errors:

Running suite(s): Built-in Data Types 62541-6 Table 1
97%: Checks: 77, Failures: 2, Errors: 0
/root/test/open62541/tests/check_builtin.c:1132:F:encode:UA_DataValue_encodeShallWorkOnExampleWithoutVariant:0: Assertion 'pos==9' failed: pos==11, 9==9
/root/test/open62541/tests/check_builtin.c:1169:F:encode:UA_DataValue_encodeShallWorkOnExampleWithVariant:0: Assertion 'pos==1+(1+4)+8' failed: pos==16, 1+(1+4)+8==14

I did some check on type sizes but there does not seem to be a difference with Raspberry PI.

UA_UInt64: 8
UA_Boolean: 1
UA_Variant: 24
UA_StatusCode: 4
UA_DateTime: 8
UA_Int16: 2

Could it be an empty array taking an extra byte or so?

Anyway, I tried building with debug symbols: cmake -DCMAKE_BUILD_TYPE=Debug
Surprisingly, the problem and the test suite errors disappear in this debug mode, just like when running it with valgrind, weird...

If there are some checks I can add to those 2 tests to identify where the difference is coming from, let me know.

Peter Speybrouck

May 14, 2015, 10:34:59 AM
to open...@googlegroups.com, s.gr...@plt.rwth-aachen.de
I messed a little bit with that second test and compared with the raspberry pi.
It seems that adding the server timestamp adds 2 extra bytes for some reason.

In this test:
START_TEST(UA_DataValue_encodeShallWorkOnExampleWithVariant) {
        // given
        UA_DataValue src;
        UA_DataValue_init(&src);
        src.serverTimestamp    = 80;
        src.hasValue = UA_TRUE;
        src.hasServerTimestamp = UA_TRUE;
        src.value.type = &UA_TYPES[UA_TYPES_INT32];
        src.value.arrayLength  = -1; // one element (encoded as not an array)
        UA_Int32  vdata  = 45;
        src.value.data = (void *)&vdata;


if I change
        src.serverTimestamp    = 80;
        src.hasServerTimestamp = UA_TRUE;
to
        //src.serverTimestamp    = 80;
        src.hasServerTimestamp = UA_FALSE;

the reported length changes to 6 on both machines.
Adding the server timestamp bumps the length up to 16 on the Lime2 instead of the expected 14 for the extra 8 bytes.
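
For reference, the expected sizes follow directly from the arithmetic in the failing assertions: 1 byte for the DataValue encoding mask, 1 + 4 bytes for a scalar Int32 Variant, and 8 bytes for the timestamp. A small sketch of that calculation (the function name is hypothetical and it only covers the fields used in these two tests):

```c
#include <stdbool.h>
#include <stddef.h>

/* Expected binary size of a DataValue that optionally carries a scalar
 * Int32 value and a server timestamp, following the arithmetic in the
 * test assertions: 1 + (1 + 4) + 8. */
static size_t expected_datavalue_size(bool has_int32_value,
                                      bool has_server_timestamp) {
    size_t size = 1;               /* DataValue encoding mask */
    if (has_int32_value)
        size += 1 + 4;             /* Variant encoding byte + Int32 payload */
    if (has_server_timestamp)
        size += 8;                 /* DateTime is a 64-bit integer */
    return size;
}
```

This gives 6 without the timestamp, 14 with value and timestamp, and 9 for the test without a Variant, so the Lime2 results of 16 and 11 are each exactly 2 bytes too long.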


After digging a little deeper, I must admit I am really stumped by this issue...
I checked out the latest code from master.
I ran: cmake -DBUILD_EXAMPLESERVER=ON -DBUILD_UNIT_TESTS=ON   ..
Running build/tests/check_builtin results in the 2 errors I posted earlier.

There is one change with which I can eliminate or reintroduce the errors.
In src/ua_types_encoding_binary.c, function UA_DataValue_encodeBinary:

if I replace this part:
    if(src->hasServerTimestamp)
        retval |= UA_DateTime_encodeBinary(&src->serverTimestamp, dst, offset);

with this silly edit:
    if(src->hasServerTimestamp)
    {
        printf("wtf??\n");
        retval |= UA_DateTime_encodeBinary(&src->serverTimestamp, dst, offset);       
    }

(or with the printf underneath the UA_DateTime_encodeBinary line)
This silly addition of the printf statement eliminates the problem. If I put the printf statement anywhere else in the UA_DataValue_encodeBinary function, the problem persists.

I can simulate the same behaviour using sprintf (to avoid having to print something).

Really strange. I'm guessing there might indeed be an issue with the compiler/C library, but I have no idea how to identify the specific problem :-(

Peter

Grüner, Sten

May 14, 2015, 11:09:39 AM
to open...@googlegroups.com
Hey Peter,

Is there a bugtracker for lime's toolchain?

This seems to be really weird... Just a quick guess:

Setting build type to Debug has one side-effect - it disables gcc optimizations (while Release build type sets it to -O2).

Just a guess, is it some optimization that breaks the code? You got to find out which one.

does it work for this one?
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_FLAGS=-O1 ..

if yes, you just have to continue adding optimization flags one by one until you find out which one breaks the code.

https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

HTH
Sten



Peter Speybrouck

May 14, 2015, 1:42:42 PM
to open...@googlegroups.com, s.gr...@plt.rwth-aachen.de
I'm not sure if there is anything specific about the toolchain. (Dual core Cortex-A7, Allwinner A20 cpu)

I'm using the 3.4 jessie kernel from this page (though not the latest build):
http://www.igorpecovnik.com/2014/11/18/olimex-lime-debian-sd-image/

it should be standard debian repository:
uname -a
Linux lime2 3.4.105-lime2 #4 SMP PREEMPT Fri Jan 23 02:52:19 EST 2015 armv7l GNU/Linux

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/4.9/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Debian 4.9.2-10' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.9 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libitm --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-armhf/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-armhf --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-armhf --with-arch-directory=arm --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 4.9.2 (Debian 4.9.2-10)


optimizations:
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_FLAGS=-O1 -DBUILD_EXAMPLESERVER=ON -DBUILD_UNIT_TESTS=ON ..
=> works
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_FLAGS=-O2 -DBUILD_EXAMPLESERVER=ON -DBUILD_UNIT_TESTS=ON ..
=> fails

So I started adding the optimizations one by one in the order they appear on that website (except a few that were not supported).
Once I got to -finline-small-functions, I got a compile error, so I left it out to test the rest.

This one still works:
cmake -DCMAKE_BUILD_TYPE=Debug -DBUILD_EXAMPLESERVER=ON -DBUILD_UNIT_TESTS=ON -DCMAKE_C_FLAGS="-O1 -fthread-jumps -falign-functions -falign-jumps -falign-loops -falign-labels -fcaller-saves -fcrossjumping -fcse-follow-jumps -fcse-skip-blocks -fdelete-null-pointer-checks -fdevirtualize -fdevirtualize-speculatively -fexpensive-optimizations -fgcse -fgcse-lm -fhoist-adjacent-loads -findirect-inlining -fipa-cp -fipa-sra -fisolate-erroneous-paths-dereference -foptimize-sibling-calls -foptimize-strlen -fpartial-inlining -fpeephole2 -freorder-blocks -freorder-functions -frerun-cse-after-loop -fsched-interblock -fsched-spec -fschedule-insns -fschedule-insns2 -fstrict-aliasing -fstrict-overflow -ftree-builtin-call-dce -ftree-switch-conversion -ftree-tail-merge -ftree-pre -ftree-vrp" ..


Add -finline-small-functions to that list and the problem appears.

It might be related to some other optimization as well, since it did not compile when only adding these optimizations:
-fthread-jumps -falign-functions  -falign-jumps -falign-loops  -falign-labels -fcaller-saves -fcrossjumping -fcse-follow-jumps -fcse-skip-blocks -fdelete-null-pointer-checks -fdevirtualize -fdevirtualize-speculatively -fexpensive-optimizations -fgcse  -fgcse-lm  -fhoist-adjacent-loads -finline-small-functions

/root/open62541/src/ua_types_encoding_binary.c: In function ‘UA_decodeBinary’:
/root/open62541/src/ua_types_encoding_binary.c:1138:29: error: ‘tempNoElements’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
                 *noElements = tempNoElements;
                             ^
cc1: all warnings being treated as errors
CMakeFiles/open62541-object.dir/build.make:100: recipe for target 'CMakeFiles/open62541-object.dir/src/ua_types_encoding_binary.c.o' failed
make[2]: *** [CMakeFiles/open62541-object.dir/src/ua_types_encoding_binary.c.o] Error 1


I'll start googling a little bit on that -finline-small-functions flag.

Peter

Peter Speybrouck

May 14, 2015, 2:12:12 PM
to open...@googlegroups.com
I did find this page though:
http://linux-sunxi.org/Toolchain

Grüner, Sten

May 14, 2015, 4:51:48 PM
to Peter Speybrouck, open...@googlegroups.com
Well, maybe this uninitialised variable is the problem? That would explain these heisenbugs...

https://github.com/acplt/open62541/blob/master/src/ua_types_encoding_binary.c#L1083

just change this line to

UA_Int32 tempNoElements = 0;

Best Regards
Sten



Peter Speybrouck

May 14, 2015, 6:20:59 PM
to open...@googlegroups.com, s.gr...@plt.rwth-aachen.de
Sten,

Thanks, that does fix the compile error but not the problem.


It must be a combination of -finline-small-functions and some other optimizations.
These, combined with at least one more optimization, cause the problem. Leave any one of them out and the problem goes away:
-fstrict-aliasing -finline-small-functions -fschedule-insns -fschedule-insns2 -fpeephole2

With all of them combined, the problem with UA_DateTime_encodeBinary appears.

I'll just go with this solution for now, but it still remains a very strange bug. Heisenbug seems indeed appropriate :-P

cmake -DCMAKE_BUILD_TYPE=Debug -DBUILD_EXAMPLESERVER=ON -DBUILD_UNIT_TESTS=ON -DCMAKE_C_FLAGS="-fno-inline-small-functions" ..
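
The root cause was not identified in this thread. Since -fstrict-aliasing is among the flags involved, one experiment that might help narrow it down is to encode the 64-bit value byte by byte instead of through pointer casts, which rules out both aliasing and unaligned 64-bit stores as factors. The following is only a sketch under that assumption; encode_u64_le is a hypothetical name and this is not open62541's encoder:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v at dst[*offset] in little-endian order, one byte at a time,
 * and advance *offset by exactly 8. No pointer casts are involved, so
 * neither strict aliasing nor unaligned 64-bit stores come into play. */
static void encode_u64_le(uint64_t v, uint8_t *dst, size_t *offset) {
    assert(dst != NULL && offset != NULL);
    for (int i = 0; i < 8; i++)
        dst[*offset + (size_t)i] = (uint8_t)(v >> (8 * i));
    *offset += 8;
}
```

If swapping in such an encoder made the extra 2 bytes disappear at -O2, that would point at the original store rather than at the toolchain; if not, the compiler flag workaround above remains the practical option.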