Problem with xmpp_stanza_get_text()

Robert

Jul 15, 2014, 12:03:42 PM
to libst...@googlegroups.com
The string returned by xmpp_stanza_get_text() is truncated at certain characters. Examples:

Remote user types: foo_"bar"
libstrophe receives: <body>foo_&quot;bar&quot;</body>
xmpp_stanza_get_text() returns: foo_

Remote user types: foo_'bar'
libstrophe receives: <body>foo_'bar'</body>
xmpp_stanza_get_text() returns: foo_

Remote user types: foo&bar
libstrophe receives: <body>foo&amp;bar</body>
xmpp_stanza_get_text() returns: foo

I guess this is a bug, but as I'm unfamiliar with the internals of libstrophe, it might take me a while to locate the problem myself. Can anyone offer any assistance?

Matthew Wild

Jul 15, 2014, 1:46:35 PM
to libst...@googlegroups.com
On 15 July 2014 17:03, Robert <brownb...@googlemail.com> wrote:
> Remote user types: foo&bar
> libstrophe receives: <body>foo&amp;bar</body>
> xmpp_stanza_get_text() returns: foo
>
> I guess this is a bug, but being unfamiliar with the internals of libstrophe
> it might take me a while to locate the problem myself. Can anyone offer
> any assistance?

A shot in the dark: any chance there are multiple text nodes in the tag?

Regards,
Matthew

Robert

Jul 15, 2014, 2:46:22 PM
to libst...@googlegroups.com
Hi Matthew.   What would that look like?   Here's the full incoming string for foo&bar, as displayed in the log, but with JIDs anonymised:

xmpp DEBUG RECV: <message id="purple9a31d1a4" to="local@localserver" type="chat" from="remote@remoteserver/1bc67329021b832e"><active xmlns="http://jabber.org/protocol/chatstates"/><body>foo&amp;bar</body></message>

I can't see anything there I don't see with messages that don't get truncated.   It would seem that ' and & have some special meaning to libstrophe or expat.   There might be a clue in that &amp; would need to get converted back to & at some stage, but '?

Matthew Wild

Jul 15, 2014, 3:48:31 PM
to libst...@googlegroups.com
On 15 July 2014 19:46, Robert <brownb...@googlemail.com> wrote:
> Hi Matthew. What would that look like?

The actual XML wouldn't look any different. But some parsers may emit
text in multiple chunks. For example it might pass to libstrophe
"foo", then "&", then "bar". These are stored as multiple children,
just like you can have multiple tags as children.

However I see in strophe.h a comment that says:

/* concatenate all child text nodes. this function
* returns a string that must be freed by the caller */
char *xmpp_stanza_get_text(xmpp_stanza_t * const stanza);

...so if there are multiple text nodes, they should all be joined
together by this function. Assuming that's working ok, it's not your
issue.
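
A minimal sketch of the chunking described above, assuming a hypothetical stand-in node type (node_t is NOT libstrophe's real stanza type, and first_text and concat_text are illustrative names only). It contrasts reading just the first text child with a two-pass join over all text children, which is what the strophe.h comment describes:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parsed child node; not libstrophe's type. */
typedef struct node {
    int is_text;          /* 1 for a text node, 0 for an element */
    const char *data;     /* text content, used only when is_text is 1 */
    struct node *next;    /* next sibling, or NULL */
} node_t;

/* Naive read: returns a heap copy of the first text child only. */
char *first_text(const node_t *children)
{
    for (const node_t *c = children; c; c = c->next) {
        if (c->is_text) {
            char *out = malloc(strlen(c->data) + 1);
            if (out)
                strcpy(out, c->data);
            return out;
        }
    }
    return NULL;
}

/* Two-pass join: measure every text child, then copy them in order.
 * Returns a heap string the caller must free, or NULL if there is no text. */
char *concat_text(const node_t *children)
{
    size_t len = 0;
    for (const node_t *c = children; c; c = c->next)
        if (c->is_text)
            len += strlen(c->data);
    if (len == 0)
        return NULL;

    char *out = malloc(len + 1);
    if (!out)
        return NULL;

    size_t pos = 0;
    for (const node_t *c = children; c; c = c->next) {
        if (c->is_text) {
            size_t n = strlen(c->data);
            memcpy(out + pos, c->data, n);
            pos += n;
        }
    }
    assert(pos == len);   /* both passes must agree on the total length */
    out[pos] = '\0';
    return out;
}
```

With three sibling text nodes holding "foo", "&" and "bar", first_text gives "foo" while concat_text gives "foo&bar", which matches the truncation reported at the top of the thread.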

What XML parser is your strophe compiled with?

Regards,
Matthew

Robert

Jul 15, 2014, 5:14:48 PM
to libst...@googlegroups.com
The code for xmpp_stanza_get_text() in stanza.c does indeed seem to concatenate child text nodes. I can't see anything there that would be thrown off by characters with a special meaning, so the problem must lie elsewhere.

I'm using expat for the XML parser.

Robert

Jul 16, 2014, 6:53:38 AM
to libst...@googlegroups.com
I've figured it out. Libstrophe doesn't in fact concatenate the text nodes, despite what my initial investigation suggested, so I have to go through all the children of <body> and concatenate the results in my own code. Many thanks Matthew, your comments pointed me in the right direction.

All the best,

Robert.

Faraz Khan

Oct 14, 2014, 10:42:34 PM
to libst...@googlegroups.com
Robert,
I'm experiencing the exact same issue. Did you just repeat the same logic that's already inside xmpp_stanza_get_text() (it seems to already loop through the children, as you noticed), or did you do something different to get the full text?

Thanks!

Robert

Oct 15, 2014, 3:55:29 AM
to libst...@googlegroups.com
Hi Faraz,

Here's the relevant code snippet:

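// Work through the children of <body>, concatenating the text nodes.
// pstanzaBody starts out pointing at the <body> stanza and is then
// reused as the cursor over its children.
pstanzaBody = xmpp_stanza_get_children(pstanzaBody);
while (pstanzaBody)
{
    if (xmpp_stanza_is_text(pstanzaBody))
    {
        // Returns a string that must be freed by the caller.
        pszBody = xmpp_stanza_get_text(pstanzaBody);
        if (pszBody)
        {
            strMsg += pszBody;
            xmpp_free(pClient->m_pStropheContext, pszBody);
        }
    }
    pstanzaBody = xmpp_stanza_get_next(pstanzaBody);
}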

strMsg is an MFC CString, so strMsg += pszBody just appends pszBody (the child text) to strMsg (the complete message text - eventually).   strMsg starts off as an empty string (if that isn't obvious).
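
For anyone not using MFC, the strMsg += pszBody step could be replaced by a small growable buffer in plain C. This is only a sketch: strbuf_t and strbuf_append are hypothetical names, not part of libstrophe or MFC, and the libstrophe calls from the snippet above are deliberately left out so the block stands on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type; not part of libstrophe or MFC. */
typedef struct {
    char *data;   /* NUL-terminated contents, or NULL while still empty */
    size_t len;   /* length excluding the terminator */
} strbuf_t;

/* Append s to sb, growing the allocation as needed.
 * Returns 0 on success, -1 on allocation failure (sb is left unchanged). */
int strbuf_append(strbuf_t *sb, const char *s)
{
    assert(sb != NULL && s != NULL);

    size_t n = strlen(s);
    char *p = realloc(sb->data, sb->len + n + 1);
    if (!p)
        return -1;

    memcpy(p + sb->len, s, n + 1);   /* n + 1 copies the terminator too */
    sb->data = p;
    sb->len += n;
    return 0;
}
```

The buffer would start as { NULL, 0 }, strbuf_append would be called once per text child in place of the += line, and sb.data would be released with free() once the message has been handled.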

Faraz Khan

Oct 15, 2014, 1:48:54 PM
to libst...@googlegroups.com
Robert,
Very awesome - works like a charm! Thanks for saving countless hours of my time! :D