Content type and encoding

Lionel Cons

Jun 16, 2010, 5:49:34 AM
to stomp...@googlegroups.com
This already started in the "Broker Neutrality" thread but I think it
deserves its own thread.

Leaving aside header encoding (discussed in another thread), what
about body type and encoding?

There are several separate issues that we want to address:
(1) get rid of the "content-length unset <=> Text Message" hack
(2) allow the mapping to more JMS types (besides Text and Bytes)
(3) allow producers to better specify what they send

Do you see other related problems?

For me, the fundamental question is: does the messaging infrastructure
need to know the body type and encoding or not?

For (3), the answer is no. If the body is of type YAML and gzip
encoded, the broker should not care.

For (1) and (2) the broker may care (if it is a JMS broker). Worse,
with encoding, the broker may have to decode it at some point.

Do we need separate headers for (1)/(2) and (3)?

Cheers,

Lionel

Ian Eccles

Jun 16, 2010, 9:08:20 AM
to stomp-spec
I'm in favor of a "content-type" header to remove the "content-length"
hack in ActiveMQ. Supporting the full range of MIME types is probably
overkill if the goal of the header is to allow the broker to make
informed decisions about how to type the message internally (or the
type presented through other interfaces, such as JMS.)

From a survey of message types from existing brokers, we could produce
a set of allowed 'content-type' values to address (2). However,
adding too many type options will probably only complicate the spec
while offering marginal improvements, as I don't expect all brokers
to implement all types for their STOMP interface. For instance, a
non-Java broker is probably not
going to bother handling JMS ObjectMessage types whose payload is a
serialized Java object. Maybe it's worth considering defining a few
'content-type' values that MUST be handled, but allow brokers to
additionally utilize their own types. We can guarantee a portable
spec while giving clients and brokers some latitude in more efficient
message representations when portability isn't their priority.

As for (3), I agree that the broker should not care if the content is
gzipped YAML, JavaScript or a PDF file. For that reason, maybe using
'content-type' to indicate the message type is inappropriate, given
its existing connotations in other protocols. The key 'message-type'
may be more appropriate for these purposes, with the added benefit
that if the producers / consumers of messages want to know something
about the body's message type, they can use the more familiar
'content-type' header without interfering with the message brokers.
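
To make the split concrete, here is a rough sketch of how a broker might pick its internal type from a 'message-type' header while ignoring 'content-type' entirely. Everything in it is an assumption for illustration: the values "text" and "bytes", the function name, and the internal type names are hypothetical, since no value set has been agreed on. When the header is absent it falls back to the existing "content-length" hack.

```python
# Hypothetical 'message-type' values that every broker MUST handle.
# These names are illustrative only; nothing here is agreed upon.
REQUIRED_TYPES = {"text": "TextMessage", "bytes": "BytesMessage"}


def internal_type(headers, broker_types=None):
    """Choose the broker's internal message type from the frame headers.

    'content-type' is deliberately never consulted: it is left to
    producers and consumers.
    """
    broker_types = broker_types or {}
    value = headers.get("message-type")
    if value is None:
        # No header: fall back to the current hack, where an unset
        # content-length means a text message.
        return "BytesMessage" if "content-length" in headers else "TextMessage"
    if value in REQUIRED_TYPES:
        return REQUIRED_TYPES[value]
    if value in broker_types:
        # Broker-specific extension, not portable across brokers.
        return broker_types[value]
    raise ValueError("unsupported message-type: %r" % value)


# The broker's decision does not depend on what the body contains.
frame_headers = {"message-type": "bytes", "content-type": "application/x-yaml"}
chosen = internal_type(frame_headers)
```

A broker that wants extra types would pass its own mapping as broker_types, which keeps the required set small and portable.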

I have no strong opinions on the actual encoding to use for message
bodies; however, I imagine this is pretty important for brokers like
ActiveMQ that discriminate between binary messages and text messages.

If a 'message-type' header is agreed upon, I'd like to suggest making
the 'content-length' header mandatory in future STOMP specifications.
Much like the "header_key: value" != "header_key:value" discussion
raised in another thread, I think the only entities serviced by not
including the 'content-length' header are people using telnet to play
around with a broker. Scanning each incoming byte of a body in search
of the "NULL byte" certainly has to affect broker performance, and
depending upon the body's character encoding, a single "NULL byte" may
be part of a single character's encoding and not the end of the
message. Granted, this shouldn't occur in UTF-8 encoding.
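
A small sketch of the difference, assuming a buffer that starts right after the blank line ending the headers and a plain dict of headers (the function name and this layout are made up for illustration, not taken from any broker). With 'content-length' the body is sliced out directly; without it the reader has to scan for the first zero byte, which cuts a UTF-16 body short.

```python
def read_body(buf, headers):
    """Return (body, rest) for one frame body found at the start of buf."""
    if "content-length" in headers:
        n = int(headers["content-length"])
        # The byte right after the declared body must be the terminator.
        if buf[n:n + 1] != b"\x00":
            raise ValueError("body is not followed by the terminating zero byte")
        return buf[:n], buf[n + 1:]
    # No length given: scan for the first zero byte.
    end = buf.find(b"\x00")
    if end < 0:
        raise ValueError("no terminating zero byte found")
    return buf[:end], buf[end + 1:]


utf16_body = "hi".encode("utf-16-le")   # every character carries a zero byte
wire = utf16_body + b"\x00"             # body plus frame terminator

scanned, _ = read_body(wire, {})
sized, _ = read_body(wire, {"content-length": str(len(utf16_body))})
utf8_has_zero = b"\x00" in "hi".encode("utf-8")
```

Here the scanned read stops after the first byte of the body, while the sized read returns all four bytes. The same text encoded as UTF-8 contains no zero byte, so scanning would have worked for it.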