POE::Filter::Stomp: Problems with Windows-style newlines


David Snopek

Apr 6, 2010, 10:08:19 AM
to Naveed Massjouni, pocomq, Kevin Esteb
2010/4/5 Naveed Massjouni <nave...@gmail.com>:
> Yes, it seems like my guess was correct and the line endings were the
> issue :) I agree that we should try to make pocomq handle windows
> style line endings.  There is some saying about being flexible about
> what you accept and strict about what you send and I think it applies
> here.

This appears to be a problem with POE::Filter::Stomp. Looking at
POE/Filter/Stomp.pm line 28:

my $eol = qr(\012\015?);

I believe those are reversed. It should be qr(\015?\012) because the
Windows newline is CR+LF, not LF+CR. It's sort of curious that they are
specified in octal, but I guess that's personal preference. ;-)

I'm CC:'ing this e-mail to the list and Kevin Esteb.

Thanks again for this bug report!
David Snopek.

Kevin Esteb

Apr 6, 2010, 11:36:23 AM
to David Snopek, Naveed Massjouni, pocomq
I don’t think that is the problem. $eol is a regex that is used to find the EOL, which the STOMP protocol defines as "newline". That is wonderfully vague, because depending on the OS and/or C runtime this can be just about anything (i.e. CR, NL, CR+NL, NL+CR, etc.). Too bad they didn’t follow this document, http://www.rfc-editor.org/EOLstory.txt, and adopt traditional practice for ASCII protocols.

So, yes, POE::Filter::Stomp is trying to be lenient when receiving packets (i.e. any combination of NL & CR), but strict when sending them (i.e. only NL).

Please submit a test to POE::Filter::Stomp that shows the problem.

Thanks

Kevin

Naveed Massjouni

Apr 6, 2010, 11:58:29 AM
to Kevin Esteb, David Snopek, pocomq
qr(\012\015?) is the same as qr/\n\r?/. So this would match "\n" and
"\n\r". Windows uses "\r\n".

In _parse_frame you do the following:
$self->{buffer} =~ s/^(.+?)$eol//s;
my $command = $1;

So from a telnet session (which uses "\r\n" line endings) when someone types:
CONNECT<ENTER>
then $command will be "CONNECT\r" and this is what is causing the problems.

I believe that $eol should be set like this:
my $eol = qr/\r?\n/;
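A minimal sketch of the behaviour described above, assuming a frame that arrives with "\r\n" line endings. The helper first_line is hypothetical and only mimics the match in _parse_frame; it is not part of POE::Filter::Stomp, and the sample buffer is made up.

```perl
use strict;
use warnings;

# Pattern currently in POE/Filter/Stomp.pm: LF, optionally followed by CR.
my $old_eol = qr(\012\015?);
# Proposed pattern: optional CR, followed by LF.
my $new_eol = qr(\015?\012);

# Hypothetical helper: return the first line of a buffer, using the
# same kind of match that _parse_frame uses to pull off the command.
sub first_line {
    my ($buffer, $eol) = @_;
    my ($command) = $buffer =~ /^(.+?)$eol/s;
    return $command;
}

# Made-up CONNECT frame as a CRLF client would send it.
my $buffer = "CONNECT\r\nlogin:guest\r\n\r\n\0";

my $old = first_line($buffer, $old_eol);
my $new = first_line($buffer, $new_eol);

# Print lengths rather than raw bytes so the stray CR is visible.
printf "old: %d chars, new: %d chars\n", length $old, length $new;
```

With the old pattern the command keeps its trailing CR (8 characters); with the proposed one it is the bare "CONNECT" (7 characters).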

Thanks,
Naveed

David Davis

Apr 6, 2010, 12:20:48 PM
to poc...@googlegroups.com, Kevin Esteb, David Snopek
What you really need is:
qr/(\\x0D\\x0A?|\\x0A\\x0D?)/

David Davis
☄ Software Engineer
http://xant.us/
http://xantus.tel/




David Davis

Apr 6, 2010, 12:22:04 PM
to poc...@googlegroups.com, Kevin Esteb, David Snopek
Sorry about that, the slashes were escaped.  Correct version:
qr/(\x0D\x0A?|\x0A\x0D?)/
David Davis
☄ Software Engineer
http://xant.us/
http://xantus.tel/


Kevin Esteb

Apr 6, 2010, 12:49:39 PM
to David Davis, poc...@googlegroups.com, David Snopek

I see the problem. Telnet uses the traditional ASCII protocol definition of EOL which is CR+NL. The STOMP protocol defines EOL as “newline”, which has different meanings depending on OS and/or C runtime.  Both processes are doing the right thing, but are ignoring each other in the process.

 

To make POE::Filter::Stomp neutral in this regard, a regex needs to be developed that handles all the possible combinations of EOL and a test to show that it really works.  

 

Something like $eol = qr/(\x0D\x0A?|\x0A\x0D?|\x0D?|\x0A?)/ might work.
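As a quick check of that candidate, here is a sketch under the assumption that the pattern is dropped into the same substitution _parse_frame uses. The sample buffer is made up. Because the last two alternatives are optional, they can match the empty string, so the non-greedy capture stops after a single character.

```perl
use strict;
use warnings;

# Candidate pattern from the message above.
my $eol = qr/(\x0D\x0A?|\x0A\x0D?|\x0D?|\x0A?)/;

# Same shape as the substitution in _parse_frame, run on a copy of the buffer.
my $buffer = "CONNECT\r\nlogin:guest\r\n\r\n\0";
(my $rest = $buffer) =~ s/^(.+?)$eol//s;
my $command = $1;

# "\x0D?" succeeds on the empty string right after the first character.
print "command: '$command'\n";   # prints: command: 'C'
```

Dropping the "?" from the last two alternatives removes the empty match.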

 

Another question: is this really necessary? Telnet is not STOMP, so why would you expect them to work together?

Naveed Massjouni

Apr 6, 2010, 1:12:10 PM
to poc...@googlegroups.com, David Davis, David Snopek, kes...@wsipc.org
I believe the stomp protocol is ambiguous with regard to what exactly
a newline is:
http://stomp.codehaus.org/Protocol
The examples on that site look exactly like telnet sessions. In
telnet, ^@ sends a null character.

This is not only a telnet issue. NMS, the Apache .NET STOMP client,
uses "\r\n" when sending STOMP frames.
http://activemq.apache.org/nms/nms.html

One of the subsystems at my work uses C# and they are using NMS to hit
our messaging service. That is how this issue was found.

Naveed

Kevin Esteb

Apr 6, 2010, 4:14:35 PM
to poc...@googlegroups.com, David Davis, David Snopek
It would be interesting to know just how your library is creating the packets. Does it have problems with receiving packets?

Working on a solution.

Naveed Massjouni

Apr 8, 2010, 12:50:13 PM
to poc...@googlegroups.com, David Davis, David Snopek
The C# client is using \r\n to separate lines in stomp frames when
sending. It first tries to send a CONNECT frame. The problem is that
it can't even CONNECT. I believe the issue is that poco::MessageQueue
parses "CONNECT\r" from the frame and returns an error, that it
doesn't know what that command is. I hope that answers your question.

Naveed

Kevin Esteb

Apr 8, 2010, 2:09:35 PM
to poc...@googlegroups.com, David Davis, David Snopek
That is part of the problem.

I spent some time reading code from the various implementations of STOMP. That code included the C# library, the Java code for the ActiveMQ STOMP connector, a Ruby server, a Python server, an Erlang server and the client stuff on the codehaus site. All I can say is that I love Perl; there is a lot of busy work in that code to do such simple things.

All the implementations are using "\n" as the EOL. Once that was learned, I then spent some time researching what those languages thought "\n" meant.

The meaning of "\n" depends on the OS and/or language you are using. That C# library translates "\n" into CRLF. Perl on Unix translates that "\n" into LF. Perl on Windows would translate that "\n" into CRLF, just like the C# library does. The same goes for the Java code, etc. That is the reason that, 20-some years ago, they came up with CRLF as the standard for ASCII transmissions across the net. Too bad the STOMP protocol chose to ignore this.

All of those implementations have some sort of native ReadLine()/WriteLine() methods that must be translating the various values of "\n" into a common value for that language. The Perl modules on the codehaus site are broken, so are Net::Stomp and Net::STOMP. They all use "\n" as EOL, which will not work cross platform.
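One way to avoid depending on what "\n" means on a given platform is to spell out the bytes when building a frame. The sketch below assumes the "only NL when sending" policy mentioned earlier in the thread. The helper build_frame and the header values are hypothetical and are not taken from any of the modules discussed.

```perl
use strict;
use warnings;

# Explicit byte values: these do not change with platform or I/O layers.
my $LF  = "\012";
my $NUL = "\000";

# Hypothetical helper: command line, header lines, blank line, body, NUL.
sub build_frame {
    my ($command, $headers, $body) = @_;
    my $frame = $command . $LF;
    for my $name (sort keys %$headers) {
        $frame .= "$name:$headers->{$name}$LF";
    }
    return $frame . $LF . $body . $NUL;
}

my $frame = build_frame('CONNECT', { login => 'guest', passcode => 'guest' }, '');

# Hex dump so the line endings are unambiguous.
print unpack('H*', $frame), "\n";
```

When such a frame is written to a socket, the handle would also need to be in binary mode so that no I/O layer rewrites the line endings.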

So I "fixed" the regex in POE::Filter::Stomp and fixed a few things that were dependent on "\012" being the EOL. It passes the standard tests on my Linux box, but I wasn't able to write a test to check the various EOL flavors (Perl actively worked against this). So please download it and give it a try. This "fixed" version is located at http://svn.kesteb.us/repos/POE-Filter-Stomp/trunk. If it works for you and hopefully others, I will upload it to CPAN.

Thanks

Naveed Massjouni

Apr 8, 2010, 2:38:05 PM
to poc...@googlegroups.com, David Davis, David Snopek
This fix didn't work for me. I believe the $eol regex is incorrect:

my $eol = qr((\012|\015|\015\012?|\012\015?));

Consider this short program:

my $frame = "fooRN";
my $eol = qr/N|R|RN|NR/;
my ($command, $match) = $frame =~ /^(.+?)($eol)/;
print "|$command|$match|";

The output is: |foo|R|

What we really would have wanted it to output is |foo|RN|. It seems
that because R was before RN, it short circuited and didn't capture
the entire RN.

I suggest using this for $eol:

my $eol = qr/\r\n|\n|\r/;
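The same comparison with real CR and LF bytes, as a sketch. Perl tries alternatives left to right and takes the first one that matches, so the single-character alternatives have to come after the two-character one. The helper eol_hex is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Ordering from the attempted fix: single characters are tried first.
my $short_first = qr/\012|\015|\015\012?|\012\015?/;
# Suggested ordering: the two-character sequence is tried first.
my $long_first  = qr/\r\n|\n|\r/;

# Hypothetical helper: hex dump of whatever the EOL pattern captured.
sub eol_hex {
    my ($frame, $eol) = @_;
    my (undef, $matched) = $frame =~ /^(.+?)($eol)/;
    return unpack 'H*', $matched;
}

for my $frame ("foo\r\n", "foo\n", "foo\r") {
    printf "%-10s short-first=%-4s long-first=%s\n",
        unpack('H*', $frame),
        eol_hex($frame, $short_first),
        eol_hex($frame, $long_first);
}
```

For "foo\r\n" the first pattern captures only 0d, leaving the LF in the buffer, while the second captures 0d0a.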

Regards,
Naveed

Naveed Massjouni

Apr 8, 2010, 2:42:52 PM
to poc...@googlegroups.com, David Davis, David Snopek
By the way, when I set my $eol = qr/\r\n|\n|\r/; it works for both
TELNET (which uses \r\n) and the perl client on linux :)

Naveed

Kevin Esteb

May 10, 2010, 4:41:40 PM
to poc...@googlegroups.com, David Davis, David Snopek, Naveed Massjouni
It has been a few weeks now. I have tested this code on a Windows box; it passes the tests and works. I can transfer frames from Windows to Linux, Linux to Windows and variations on the theme. It is also backwards compatible with v0.03.

In a couple of weeks, I will release this as v0.04 to CPAN.

Kevin Esteb

May 11, 2010, 1:05:12 PM
to Naveed Massjouni, David Davis, David Snopek, poc...@googlegroups.com
Yes, you are right, I didn't receive that email. The following code shows that you are correct.

sub ascii_to_hex {
    ## Convert each ASCII character to a two-digit hex number.
    (my $str = shift) =~ s/(.|\n)/sprintf("%02lx", ord $1)/eg;
    return $str;
}

sub hex_to_ascii {
    ## Convert each two-digit hex number back to an ASCII character.
    (my $str = shift) =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;
    return $str;
}

my $frame;
my $command;
my $x;    # holds the EOL sequence that was matched
my $eol = qr((\015\012?|\012\015?|\015|\012));

$frame = "foo\r";
($command, $x) = $frame =~ /^(.+?)($eol)/;
printf("|%s|%s|\n", $command, ascii_to_hex($x));

$frame = "foo\r\n";
($command, $x) = $frame =~ /^(.+?)($eol)/;
printf("|%s|%s|\n", $command, ascii_to_hex($x));

$frame = "foo\n\r";
($command, $x) = $frame =~ /^(.+?)($eol)/;
printf("|%s|%s|\n", $command, ascii_to_hex($x));

$frame = "foo\n";
($command, $x) = $frame =~ /^(.+?)($eol)/;
printf("|%s|%s|\n", $command, ascii_to_hex($x));

Which produces the following output.

|foo|0d|
|foo|0d0a|
|foo|0a0d|
|foo|0a|

I have updated the code to reflect this change. I am currently running this modification in production without any problems. So it is also backwards compatible.

Thank you for your help in debugging and suggesting a solution. This updated version will be released to CPAN in a couple of weeks.

-----Original Message-----
From: Naveed Massjouni [mailto:nave...@gmail.com]
Sent: Tuesday, May 11, 2010 1:56 AM
To: Kevin Esteb
Cc: David Davis; David Snopek
Subject: Re: POE::Filter::Stomp: Problems with Windows-style newlines

Kevin, I don't think you got the replies I sent on April 8. I seem
to have replied to all, including both Davids and the pocomq
group, but somehow missed you. Here is a copy of my reply on April
8:

---

This fix didn't work for me. I believe the $eol regex is incorrect:

my $eol = qr((\012|\015|\015\012?|\012\015?));

Consider this short program:

my $frame = "fooRN";
my $eol = qr/N|R|RN|NR/;
my ($command, $match) = $frame =~ /^(.+?)($eol)/;
print "|$command|$match|";

The output is: |foo|R|

What we really would have wanted it to output is |foo|RN|. It seems
that because R was before RN, it short circuited and didn't capture
the entire RN.

I suggest using this for $eol:

my $eol = qr/\r\n|\n|\r/;

I have tested this and when I set my $eol = qr/\r\n|\n|\r/; it works for both
TELNET (which uses \r\n) and the perl client on linux.

---

Thanks,
Naveed